class String

A String object has an arbitrary sequence of bytes, typically representing text or binary data. A String object may be created using String::new or as literals.

String objects differ from Symbol objects in that Symbol objects are designed to be used as identifiers, instead of text or data.

You can create a String object explicitly with:

A string literal.
A heredoc literal.

You can convert certain objects to Strings with:

Method String.

Some String methods modify self. Typically, a method whose name ends with ! modifies self and returns self; often, a similarly named method (without the !) returns a new string.

In general, if both bang and non-bang versions of a method exist, the bang method mutates and the non-bang method does not. However, a method without a bang can also mutate, such as String#replace.
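The convention can be seen in a minimal sketch using String#upcase, String#upcase!, and String#replace; the variable names are illustrative only.

```ruby
s = String.new('hello')   # an unfrozen string

t = s.upcase              # non-bang: returns a new string
t                         # => "HELLO"
s                         # => "hello" (unchanged)

s.upcase!                 # bang: modifies s in place and returns s
s                         # => "HELLO"
s.upcase!                 # => nil (nothing left to change)

s.replace('bye')          # no bang in the name, but s is still mutated
s                         # => "bye"
```

Because a bang method may return nil when it makes no changes, chaining calls on its return value is risky.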

Substitution Methods

These methods perform substitutions:

String#sub: One substitution (or none); returns a new string.
String#sub!: One substitution (or none); returns self if any changes, nil otherwise.
String#gsub: Zero or more substitutions; returns a new string.
String#gsub!: Zero or more substitutions; returns self if any changes, nil otherwise.

Each of these methods takes:

A first argument, pattern (String or Regexp), that specifies the substring(s) to be replaced.
Either of the following:
- A second argument, replacement (String or Hash), that determines the replacing string.
- A block that will determine the replacing string.

The examples in this section mostly use the String#sub and String#gsub methods; the principles illustrated apply to all four substitution methods.

Argument pattern

Argument pattern is commonly a regular expression:

s = 'hello'
s.sub(/[aeiou]/, '*') # => "h*llo"
s.gsub(/[aeiou]/, '*') # => "h*ll*"
s.gsub(/[aeiou]/, '') # => "hll"
s.sub(/ell/, 'al') # => "halo"
s.gsub(/xyzzy/, '*') # => "hello"
'THX1138'.gsub(/\d+/, '00') # => "THX00"

When pattern is a string, all its characters are treated as ordinary characters (not as Regexp special characters):

'THX1138'.gsub('\d+', '00') # => "THX1138"

String replacement

If replacement is a string, that string determines the replacing string that is substituted for the matched text.

Each of the examples above uses a simple string as the replacing string.

String replacement may contain back-references to the pattern’s captures:

\n (n is a non-negative integer) refers to $n.
\k<name> refers to the named capture name.

See Regexp for details.

Note that within the string replacement, a character combination such as $& is treated as ordinary text, not as a special match variable. However, you may refer to some special match variables using these combinations:

\& and \0 correspond to $&, which contains the complete matched text.
\' corresponds to $', which contains the string after the match.
\` corresponds to $`, which contains the string before the match.
\+ corresponds to $+, which contains the last capture group.

See Regexp for details.

Note that \\ is interpreted as an escape, i.e., a single backslash.

Note also that a string literal consumes backslashes. See String Literals for details about string literals.

A back-reference is typically preceded by an additional backslash. For example, if you want to write a back-reference \& in replacement with a double-quoted string literal, you need to write "..\\&..".

If you want to write a non-back-reference string \& in replacement, you need to first escape the backslash to prevent this method from interpreting it as a back-reference, and then you need to escape the backslashes again to prevent a string literal from consuming them: "..\\\\&..".

You may want to use the block form to avoid excessive backslashes.
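The escaping rules above are easier to follow side by side. The following is a small sketch with made-up sample strings; it compares numbered and named back-references, the doubled backslash needed in double-quoted literals, and the block form.

```ruby
# Numbered back-references; a single-quoted literal keeps the backslash as-is.
'John Smith'.sub(/(\w+) (\w+)/, '\2, \1')
# => "Smith, John"

# Named back-references.
'John Smith'.sub(/(?<first>\w+) (?<last>\w+)/, '\k<last>, \k<first>')
# => "Smith, John"

# \0 is the whole match; in a double-quoted literal the backslash is doubled.
'hello'.gsub(/l/, "<\\0>")
# => "he<l><l>o"

# A literal backslash followed by & takes four backslashes in the literal.
'hello'.sub(/l/, "\\\\&")
# => "he\\&lo"

# Block form: the return value is used verbatim, so no back-reference escaping.
'hello'.sub(/l/) { '\&' }
# => "he\\&lo"
```

The last two calls produce the same result (h, e, backslash, ampersand, l, o); the block form simply avoids the second layer of escaping.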

Hash replacement

If the argument replacement is a hash, and pattern matches one of its keys, the replacing string is the value for that key:

h = {'foo' => 'bar', 'baz' => 'bat'}
'food'.sub('foo', h) # => "bard"

Note that a symbol key does not match:

h = {foo: 'bar', baz: 'bat'}
'food'.sub('foo', h) # => "d"
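A hash replacement is most useful with gsub and a pattern that can match several different keys. The sketch below uses an invented lookup table; it also shows that a match with no corresponding key is replaced by the empty string, which is what happens in the symbol-key case above.

```ruby
h = { 'cat' => 'feline', 'dog' => 'canine' }

# Each matched substring is looked up in the hash by its exact text.
'cat and dog'.gsub(/cat|dog/, h)     # => "feline and canine"

# "cow" matches the pattern but is not a key, so it is replaced by "".
'cat and cow'.gsub(/cat|cow/, h)     # => "feline and "

# Symbol keys never match, because the matched text is a String.
'cat'.sub('cat', { cat: 'feline' })  # => ""
```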

Block

In the block form, the current match string is passed to the block; the block’s return value becomes the replacing string:

s = '@'
'1234'.gsub(/\d/) {|match| s.succ! } # => "ABCD"

Special match variables such as $1, $2, $`, $&, and $' are set appropriately.
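As a brief illustration of the match variables inside a block (the sample strings are invented for this sketch):

```ruby
# $1 and $2 hold the captures of the current match on each iteration.
'3 apples, 5 pears'.gsub(/(\d+) (\w+)/) { "#{$2}: #{$1.to_i * 2}" }
# => "apples: 6, pears: 10"

# $~ is the MatchData for the current match, which gives access to named captures.
'2024-01-15'.sub(/(?<y>\d+)-(?<m>\d+)-(?<d>\d+)/) { "#{$~[:d]}/#{$~[:m]}/#{$~[:y]}" }
# => "15/01/2024"
```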

Whitespace in Strings

In the class String, whitespace is defined as a contiguous sequence of characters consisting of any mixture of the following:

NUL (null): "\x00", "\u0000".
HT (horizontal tab): "\x09", "\t".
LF (line feed): "\x0a", "\n".
VT (vertical tab): "\x0b", "\v".
FF (form feed): "\x0c", "\f".
CR (carriage return): "\x0d", "\r".
SP (space): "\x20", " ".

Whitespace is relevant for the following methods:

lstrip, lstrip!: Strip leading whitespace.
rstrip, rstrip!: Strip trailing whitespace.
strip, strip!: Strip leading and trailing whitespace.
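A short sketch of the three stripping methods on a string with mixed leading and trailing whitespace; interior whitespace is left alone.

```ruby
s = "\t\n hello world \r\n"

s.lstrip   # => "hello world \r\n"
s.rstrip   # => "\t\n hello world"
s.strip    # => "hello world"
s          # => "\t\n hello world \r\n" (non-bang methods do not modify s)

# The bang variants return nil when there is nothing to strip.
'hello'.dup.strip!   # => nil
```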

What’s Here

First, what’s elsewhere. Class String:

Inherits from the Object class.
Includes the Comparable module.

Here, class String provides methods that are useful for:

Creating a String

::new: Returns a new string.
::try_convert: Returns a new string created from a given object.

Freezing/Unfreezing

+@: Returns a string that is not frozen: self if not frozen; self.dup otherwise.
-@ (aliased as dedup): Returns a string that is frozen: self if already frozen; self.freeze otherwise.
freeze: Freezes self if not already frozen; returns self.

Querying

Counts

length (aliased as size): Returns the count of characters (not bytes).
empty?: Returns whether the length of self is zero.
bytesize: Returns the count of bytes.
count: Returns the count of characters in self that belong to the given character sets.
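The difference between character counts and byte counts only shows up with multibyte text. This sketch assumes the default UTF-8 source encoding:

```ruby
s = 'こんにちは'   # five characters

s.length        # => 5
s.size          # => 5
s.bytesize      # => 15 (three bytes per character in UTF-8)
''.empty?       # => true

# count treats its argument as a set of characters, not as a substring.
'hello world'.count('lo')   # => 5 (three "l" plus two "o")
```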

Substrings

=~: Returns the index of the first substring that matches a given Regexp or other object; returns nil if no match is found.
byteindex: Returns the byte index of the first occurrence of a given substring.
byterindex: Returns the byte index of the last occurrence of a given substring.
index: Returns the index of the first occurrence of a given substring; returns nil if none found.
rindex: Returns the index of the last occurrence of a given substring; returns nil if none found.
include?: Returns true if the string contains a given substring; false otherwise.
match: Returns a MatchData object if the string matches a given Regexp; nil otherwise.
match?: Returns true if the string matches a given Regexp; false otherwise.
start_with?: Returns true if the string begins with any of the given substrings.
end_with?: Returns true if the string ends with any of the given substrings.

Encodings

encoding: Returns the Encoding object that represents the encoding of the string.
unicode_normalized?: Returns true if the string is in Unicode normalized form; false otherwise.
valid_encoding?: Returns true if the string contains only characters that are valid for its encoding.
ascii_only?: Returns true if the string has only ASCII characters; false otherwise.

Other

sum: Returns a basic checksum for the string: the sum of each byte.
hash: Returns the integer hash code.

Comparing

== (aliased as ===): Returns true if a given other string has the same content as self.
eql?: Returns true if the content is the same as the given other string.
<=>: Returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp: Ignoring case, returns -1, 0, or 1 as self is smaller than, equal to, or larger than a given other string.
casecmp?: Ignoring case, returns whether a given other string is equal to self.

Modifying

Each of these methods modifies self.

Insertion

insert: Returns self with a given string inserted at a specified offset.
<<: Returns self concatenated with a given string or integer.
append_as_bytes: Returns self concatenated with strings without performing any encoding validation or conversion.
prepend: Prefixes to self the concatenation of given other strings.

Substitution

bytesplice: Replaces bytes of self with bytes from a given string; returns self.
sub!: Replaces the first substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
gsub!: Replaces each substring that matches a given pattern with a given replacement string; returns self if any changes, nil otherwise.
succ! (aliased as next!): Returns self modified to become its own successor.
replace: Returns self with its entire content replaced by a given string.
reverse!: Returns self with its characters in reverse order.
setbyte: Sets the byte at a given integer offset to a given value; returns the argument.
tr!: Replaces specified characters in self with specified replacement characters; returns self if any changes, nil otherwise.
tr_s!: Replaces specified characters in self with specified replacement characters, removing duplicates from the substrings that were modified; returns self if any changes, nil otherwise.

Casing

capitalize!: Upcases the initial character and downcases all others; returns self if any changes, nil otherwise.
downcase!: Downcases all characters; returns self if any changes, nil otherwise.
upcase!: Upcases all characters; returns self if any changes, nil otherwise.
swapcase!: Upcases each downcase character and downcases each upcase character; returns self if any changes, nil otherwise.

Encoding

encode!: Returns self with all characters transcoded from one encoding to another.
unicode_normalize!: Unicode-normalizes self; returns self.
scrub!: Replaces each invalid byte with a given character; returns self.
force_encoding: Changes the encoding to a given encoding; returns self.

Deletion

clear: Removes all content, so that self is empty; returns self.
slice!, []=: Removes a substring determined by a given index, start/length, range, regexp, or substring.
squeeze!: Removes contiguous duplicate characters; returns self if any changes, nil otherwise.
delete!: Removes characters as determined by the intersection of substring arguments.
delete_prefix!: Removes leading prefix; returns self if any changes, nil otherwise.
delete_suffix!: Removes trailing suffix; returns self if any changes, nil otherwise.
lstrip!: Removes leading whitespace; returns self if any changes, nil otherwise.
rstrip!: Removes trailing whitespace; returns self if any changes, nil otherwise.
strip!: Removes leading and trailing whitespace; returns self if any changes, nil otherwise.
chomp!: Removes the trailing record separator, if found; returns self if any changes, nil otherwise.
chop!: Removes trailing newline characters if found; otherwise removes the last character; returns self if any changes, nil otherwise.

Converting to New String

Each of these methods returns a new String based on self, often just a modified copy of self.

Extension

*: Returns the concatenation of multiple copies of self.
+: Returns the concatenation of self and a given other string.
center: Returns a copy of self, centered by specified padding.
concat: Returns the concatenation of self with given other strings.
ljust: Returns a copy of self of a given length, right-padded with a given other string.
rjust: Returns a copy of self of a given length, left-padded with a given other string.

Encoding

b: Returns a copy of self with ASCII-8BIT encoding.
scrub: Returns a copy of self with each invalid byte replaced with a given character.
unicode_normalize: Returns a copy of self with each character Unicode-normalized.
encode: Returns a copy of self with all characters transcoded from one encoding to another.

Substitution

dump: Returns a printable version of self, enclosed in double-quotes.
undump: Returns a copy of self with all \xNN notations replaced by \uNNNN notations and all escaped characters unescaped.
sub: Returns a copy of self with the first substring matching a given pattern replaced with a given replacement string.
gsub: Returns a copy of self with each substring that matches a given pattern replaced with a given replacement string.
succ (aliased as next): Returns the string that is the successor to self.
reverse: Returns a copy of self with its characters in reverse order.
tr: Returns a copy of self with specified characters replaced with specified replacement characters.
tr_s: Returns a copy of self with specified characters replaced with specified replacement characters, removing duplicates from the substrings that were modified.
%: Returns the string resulting from formatting a given object into self.

Casing

capitalize: Returns a copy of self with the first character upcased and all other characters downcased.
downcase: Returns a copy of self with all characters downcased.
upcase: Returns a copy of self with all characters upcased.
swapcase: Returns a copy of self with all upcase characters downcased and all downcase characters upcased.

Deletion

delete: Returns a copy of self with characters removed.
delete_prefix: Returns a copy of self with a given prefix removed.
delete_suffix: Returns a copy of self with a given suffix removed.
lstrip: Returns a copy of self with leading whitespace removed.
rstrip: Returns a copy of self with trailing whitespace removed.
strip: Returns a copy of self with leading and trailing whitespace removed.
chomp: Returns a copy of self with a trailing record separator removed, if found.
chop: Returns a copy of self with trailing newline characters or the last character removed.
squeeze: Returns a copy of self with contiguous duplicate characters removed.
[] (aliased as slice): Returns a substring determined by a given index, start/length, range, regexp, or string.
byteslice: Returns a substring determined by a given index, start/length, or range.
chr: Returns the first character.

Duplication

to_s (aliased as to_str): If self is a subclass of String, returns self copied into a String; otherwise, returns self.

Converting to Non-String

Each of these methods converts the contents of self to a non-String.

Characters, Bytes, and Clusters

bytes: Returns an array of the bytes in self.
chars: Returns an array of the characters in self.
codepoints: Returns an array of the integer ordinals in self.
getbyte: Returns the integer byte at the given index in self.
grapheme_clusters: Returns an array of the grapheme clusters in self.

Splitting

lines: Returns an array of the lines in self, as determined by a given record separator.
partition: Returns a 3-element array determined by the first substring that matches a given substring or regexp.
rpartition: Returns a 3-element array determined by the last substring that matches a given substring or regexp.
split: Returns an array of substrings determined by a given delimiter (regexp or string) or, if a block is given, passes those substrings to the block.

Matching

scan: Returns an array of substrings matching a given regexp or string, or, if a block is given, passes each matching substring to the block.
unpack: Returns an array of substrings extracted from self according to a given format.
unpack1: Returns the first substring extracted from self according to a given format.

Numerics

hex: Returns the integer value of the leading characters, interpreted as hexadecimal digits.
oct: Returns the integer value of the leading characters, interpreted as octal digits.
ord: Returns the integer ordinal of the first character in self.
to_i: Returns the integer value of leading characters, interpreted as an integer.
to_f: Returns the floating-point value of leading characters, interpreted as a floating-point number.

Strings and Symbols

inspect: Returns a copy of self, enclosed in double quotes, with special characters escaped.
intern (aliased as to_sym): Returns the symbol corresponding to self.

Iterating

each_byte: Calls the given block with each successive byte in self.
each_char: Calls the given block with each successive character in self.
each_codepoint: Calls the given block with each successive integer codepoint in self.
each_grapheme_cluster: Calls the given block with each successive grapheme cluster in self.
each_line: Calls the given block with each successive line in self, as determined by a given record separator.
upto: Calls the given block with each string value returned by successive calls to succ.

Public Class Methods

json_create(o)

Source

# File ext/json/lib/json/add/string.rb, line 11
def self.json_create(object)
  object["raw"].pack("C*")
end

Raw Strings are JSON Objects (the raw bytes are stored in an array for the key “raw”). The Ruby String can be created by this class method.
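The sketch below does not call String.json_create itself (that would require loading the JSON additions); it only reproduces the expression from the method body on a hypothetical hash of the shape described above, to show what the conversion yields.

```ruby
# Hypothetical input: the raw bytes are stored in an array under the key "raw".
object = { 'raw' => [104, 105, 255] }

# Same expression as the method body: pack each integer as one unsigned byte.
s = object['raw'].pack('C*')

s.bytes                                 # => [104, 105, 255]
s.encoding == Encoding::ASCII_8BIT      # => true (packed strings are binary)
```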

new(string = ''.encode(Encoding::ASCII_8BIT), **options) → new_string

Source

static VALUE
rb_str_init(int argc, VALUE *argv, VALUE str)
{
    static ID keyword_ids[2];
    VALUE orig, opt, venc, vcapa;
    VALUE kwargs[2];
    rb_encoding *enc = 0;
    int n;
    if (!keyword_ids[0]) {
        keyword_ids[0] = rb_id_encoding();
        CONST_ID(keyword_ids[1], "capacity");
    }
    n = rb_scan_args(argc, argv, "01:", &orig, &opt);
    if (!NIL_P(opt)) {
        rb_get_kwargs(opt, keyword_ids, 0, 2, kwargs);
        venc = kwargs[0];
        vcapa = kwargs[1];
        if (!UNDEF_P(venc) && !NIL_P(venc)) {
            enc = rb_to_encoding(venc);
        }
        if (!UNDEF_P(vcapa) && !NIL_P(vcapa)) {
            long capa = NUM2LONG(vcapa);
            long len = 0;
            int termlen = enc ? rb_enc_mbminlen(enc) : 1;
            if (capa < STR_BUF_MIN_SIZE) {
                capa = STR_BUF_MIN_SIZE;
            }
            if (n == 1) {
                StringValue(orig);
                len = RSTRING_LEN(orig);
                if (capa < len) {
                    capa = len;
                }
                if (orig == str) n = 0;
            }
            str_modifiable(str);
            if (STR_EMBED_P(str) || FL_TEST(str, STR_SHARED|STR_NOFREE)) {
                /* make noembed always */
                const size_t size = (size_t)capa + termlen;
                const char *const old_ptr = RSTRING_PTR(str);
                const size_t osize = RSTRING_LEN(str) + TERM_LEN(str);
                char *new_ptr = ALLOC_N(char, size);
                if (STR_EMBED_P(str)) RUBY_ASSERT((long)osize <= str_embed_capa(str));
                memcpy(new_ptr, old_ptr, osize < size ? osize : size);
                FL_UNSET_RAW(str, STR_SHARED|STR_NOFREE);
                RSTRING(str)->as.heap.ptr = new_ptr;
            }
            else if (STR_HEAP_SIZE(str) != (size_t)capa + termlen) {
                SIZED_REALLOC_N(RSTRING(str)->as.heap.ptr, char,
                        (size_t)capa + termlen, STR_HEAP_SIZE(str));
            }
            STR_SET_LEN(str, len);
            TERM_FILL(&RSTRING(str)->as.heap.ptr[len], termlen);
            if (n == 1) {
                memcpy(RSTRING(str)->as.heap.ptr, RSTRING_PTR(orig), len);
                rb_enc_cr_str_exact_copy(str, orig);
            }
            FL_SET(str, STR_NOEMBED);
            RSTRING(str)->as.heap.aux.capa = capa;
        }
        else if (n == 1) {
            rb_str_replace(str, orig);
        }
        if (enc) {
            rb_enc_associate(str, enc);
            ENC_CODERANGE_CLEAR(str);
        }
    }
    else if (n == 1) {
        rb_str_replace(str, orig);
    }
    return str;
}

Returns a new String object containing the given string.

The options are optional keyword options (see below).

With no argument given and keyword encoding also not given, returns an empty string with the Encoding ASCII-8BIT:

s = String.new # => ""
s.encoding # => #<Encoding:ASCII-8BIT>

With argument string given and keyword option encoding not given, returns a new string with the same encoding as string:

s0 = 'foo'.encode(Encoding::UTF_16)
s1 = String.new(s0)
s1.encoding # => #<Encoding:UTF-16 (dummy)>

(Unlike String.new, a string literal like '' or a here document literal always has script encoding.)

With keyword option encoding given, returns a string with the specified encoding; the encoding may be an Encoding object, an encoding name, or an encoding name alias:

String.new(encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII>
String.new('', encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII>
String.new('foo', encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII>
String.new('foo', encoding: 'US-ASCII').encoding # => #<Encoding:US-ASCII>
String.new('foo', encoding: 'ASCII').encoding # => #<Encoding:US-ASCII>

The given encoding need not be valid for the string’s content, and its validity is not checked:

s = String.new('こんにちは', encoding: 'ascii')
s.valid_encoding? # => false

But the given encoding itself is checked:

String.new('foo', encoding: 'bar') # Raises ArgumentError.

With keyword option capacity given, the given value is advisory only, and may or may not set the size of the internal buffer, which may in turn affect performance:

String.new('foo', capacity: 1) # Buffer size is at least 4 (includes terminal null byte).
String.new('foo', capacity: 4096) # Buffer size is at least 4;
# may be equal to, greater than, or less than 4096.

try_convert(object) → object, new_string, or nil

Source

static VALUE
rb_str_s_try_convert(VALUE dummy, VALUE str)
{
    return rb_check_string_type(str);
}

Attempts to convert the given object to a string.

If object is already a string, returns object, unmodified.

Otherwise if object responds to :to_str, calls object.to_str and returns the result.

Returns nil if object does not respond to :to_str.

Raises an exception unless object.to_str returns a string.
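The four cases above can be sketched as follows. The classes Title and BadTitle are hypothetical and exist only for this illustration; the exception class shown is the one raised by current CRuby.

```ruby
# Hypothetical class that converts implicitly to a string via to_str.
class Title
  def initialize(text)
    @text = text
  end

  def to_str
    @text
  end
end

# Hypothetical class whose to_str breaks the contract by returning a non-string.
class BadTitle
  def to_str
    42
  end
end

s = 'foo'
String.try_convert(s).equal?(s)         # => true (same object, unmodified)
String.try_convert(Title.new('Ruby'))   # => "Ruby"
String.try_convert(123)                 # => nil (Integer has no to_str)

begin
  String.try_convert(BadTitle.new)
rescue TypeError => e
  e.class                               # => TypeError
end
```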

Public Instance Methods

self % object → new_string

Source

static VALUE
rb_str_format_m(VALUE str, VALUE arg)
{
    VALUE tmp = rb_check_array_type(arg);
    if (!NIL_P(tmp)) {
        return rb_str_format(RARRAY_LENINT(tmp), RARRAY_CONST_PTR(tmp), str);
    }
    return rb_str_format(1, &arg, str);
}

Returns the result of formatting object into the format specifications contained in self (see Format Specifications):

'%05d' % 123 # => "00123"

If self contains multiple format specifications, object must be an array or hash containing the objects to be formatted:

'%-5s: %016x' % [ 'ID', self.object_id ] # => "ID   : 00002b054ec93168"
'foo = %{foo}' % { foo: 'bar' } # => "foo = bar"
'foo = %{foo}, baz = %{baz}' % { foo: 'bar', baz: 'bat' } # => "foo = bar, baz = bat"

Related: see Converting to New String.

self * n → new_string

Source

VALUE
rb_str_times(VALUE str, VALUE times)
{
    VALUE str2;
    long n, len;
    char *ptr2;
    int termlen;
    if (times == INT2FIX(1)) {
        return str_duplicate(rb_cString, str);
    }
    if (times == INT2FIX(0)) {
        str2 = str_alloc_embed(rb_cString, 0);
        rb_enc_copy(str2, str);
        return str2;
    }
    len = NUM2LONG(times);
    if (len < 0) {
        rb_raise(rb_eArgError, "negative argument");
    }
    if (RSTRING_LEN(str) == 1 && RSTRING_PTR(str)[0] == 0) {
        if (STR_EMBEDDABLE_P(len, 1)) {
            str2 = str_alloc_embed(rb_cString, len + 1);
            memset(RSTRING_PTR(str2), 0, len + 1);
        }
        else {
            str2 = str_alloc_heap(rb_cString);
            RSTRING(str2)->as.heap.aux.capa = len;
            RSTRING(str2)->as.heap.ptr = ZALLOC_N(char, (size_t)len + 1);
        }
        STR_SET_LEN(str2, len);
        rb_enc_copy(str2, str);
        return str2;
    }
    if (len && LONG_MAX/len <  RSTRING_LEN(str)) {
        rb_raise(rb_eArgError, "argument too big");
    }
    len *= RSTRING_LEN(str);
    termlen = TERM_LEN(str);
    str2 = str_enc_new(rb_cString, 0, len, STR_ENC_GET(str));
    ptr2 = RSTRING_PTR(str2);
    if (len) {
        n = RSTRING_LEN(str);
        memcpy(ptr2, RSTRING_PTR(str), n);
        while (n <= len/2) {
            memcpy(ptr2 + n, ptr2, n);
            n *= 2;
        }
        memcpy(ptr2 + n, ptr2, len-n);
    }
    STR_SET_LEN(str2, len);
    TERM_FILL(&ptr2[len], termlen);
    rb_enc_cr_str_copy_for_substr(str2, str);
    return str2;
}

Returns a new string containing n copies of self:

'Ho!' * 3 # => "Ho!Ho!Ho!"
'No!' * 0 # => ""

Related: see Converting to New String.

self + other_string → new_string

Source

VALUE
rb_str_plus(VALUE str1, VALUE str2)
{
    VALUE str3;
    rb_encoding *enc;
    char *ptr1, *ptr2, *ptr3;
    long len1, len2;
    int termlen;
    StringValue(str2);
    enc = rb_enc_check_str(str1, str2);
    RSTRING_GETMEM(str1, ptr1, len1);
    RSTRING_GETMEM(str2, ptr2, len2);
    termlen = rb_enc_mbminlen(enc);
    if (len1 > LONG_MAX - len2) {
        rb_raise(rb_eArgError, "string size too big");
    }
    str3 = str_enc_new(rb_cString, 0, len1+len2, enc);
    ptr3 = RSTRING_PTR(str3);
    memcpy(ptr3, ptr1, len1);
    memcpy(ptr3+len1, ptr2, len2);
    TERM_FILL(&ptr3[len1+len2], termlen);
    ENCODING_CODERANGE_SET(str3, rb_enc_to_index(enc),
                           ENC_CODERANGE_AND(ENC_CODERANGE(str1), ENC_CODERANGE(str2)));
    RB_GC_GUARD(str1);
    RB_GC_GUARD(str2);
    return str3;
}

Returns a new string containing other_string concatenated to self:

'Hello from ' + self.to_s # => "Hello from main"

Related: see Converting to New String.

+string → new_string or self

Source

static VALUE
str_uplus(VALUE str)
{
    if (OBJ_FROZEN(str) || CHILLED_STRING_P(str)) {
        return rb_str_dup(str);
    }
    else {
        return str;
    }
}

Returns self if self is not frozen and can be mutated without issuing a warning.

Otherwise returns self.dup, which is not frozen.

Related: see Freezing/Unfreezing.
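A minimal sketch of both outcomes of unary plus; the variable names are illustrative only.

```ruby
frozen = 'foo'.freeze
copy = +frozen

copy.frozen?           # => false
copy.equal?(frozen)    # => false (a duplicate was returned)
copy << 'bar'          # => "foobar"
frozen                 # => "foo" (the frozen original is untouched)

mutable = String.new('foo')
(+mutable).equal?(mutable)   # => true (already mutable, so self is returned)
```

This makes `+str` a cheap way to obtain a string that is safe to mutate without copying when no copy is needed.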

-self → frozen_string

Source

static VALUE
str_uminus(VALUE str)
{
    if (!BARE_STRING_P(str) && !rb_obj_frozen_p(str)) {
        str = rb_str_dup(str);
    }
    return rb_fstring(str);
}

Returns a frozen string equal to self.

The returned string is self if and only if all of the following are true:

self is already frozen.
self is an instance of String (rather than of a subclass of String).
self has no instance variables set on it.

Otherwise, the returned string is a frozen copy of self.

Returning self, when possible, saves duplicating self; see Data deduplication.

It may also save duplicating other, already-existing, strings:

s0 = 'foo'
s1 = 'foo'
s0.object_id == s1.object_id # => false
(-s0).object_id == (-s1).object_id # => true

Note that method -@ is convenient for defining a constant:

FileName = -'config/database.yml'

While its alias dedup is better suited for chaining:

'foo'.dedup.gsub!('o')

Related: see Freezing/Unfreezing.

Also aliased as: dedup

self << object → self

Source

VALUE
rb_str_concat(VALUE str1, VALUE str2)
{
    unsigned int code;
    rb_encoding *enc = STR_ENC_GET(str1);
    int encidx;
    if (RB_INTEGER_TYPE_P(str2)) {
        if (rb_num_to_uint(str2, &code) == 0) {
        }
        else if (FIXNUM_P(str2)) {
            rb_raise(rb_eRangeError, "%ld out of char range", FIX2LONG(str2));
        }
        else {
            rb_raise(rb_eRangeError, "bignum out of char range");
        }
    }
    else {
        return rb_str_append(str1, str2);
    }
    encidx = rb_ascii8bit_appendable_encoding_index(enc, code);
    if (encidx >= 0) {
        rb_str_buf_cat_byte(str1, (unsigned char)code);
    }
    else {
        long pos = RSTRING_LEN(str1);
        int cr = ENC_CODERANGE(str1);
        int len;
        char *buf;
        switch (len = rb_enc_codelen(code, enc)) {
          case ONIGERR_INVALID_CODE_POINT_VALUE:
            rb_raise(rb_eRangeError, "invalid codepoint 0x%X in %s", code, rb_enc_name(enc));
            break;
          case ONIGERR_TOO_BIG_WIDE_CHAR_VALUE:
          case 0:
            rb_raise(rb_eRangeError, "%u out of char range", code);
            break;
        }
        buf = ALLOCA_N(char, len + 1);
        rb_enc_mbcput(code, buf, enc);
        if (rb_enc_precise_mbclen(buf, buf + len + 1, enc) != len) {
            rb_raise(rb_eRangeError, "invalid codepoint 0x%X in %s", code, rb_enc_name(enc));
        }
        rb_str_resize(str1, pos+len);
        memcpy(RSTRING_PTR(str1) + pos, buf, len);
        if (cr == ENC_CODERANGE_7BIT && code > 127) {
            cr = ENC_CODERANGE_VALID;
        }
        else if (cr == ENC_CODERANGE_BROKEN) {
            cr = ENC_CODERANGE_UNKNOWN;
        }
        ENC_CODERANGE_SET(str1, cr);
    }
    return str1;
}

Appends a string representation of object to self; returns self.

If object is a string, appends it to self:

s = 'foo'
s << 'bar' # => "foobar"
s # => "foobar"

If object is an integer, its value is considered a codepoint; converts the value to a character before concatenating:

s = 'foo'
s << 33 # => "foo!"

Additionally, if the codepoint is in range 0..0xff and the encoding of self is Encoding::US_ASCII, changes the encoding to Encoding::ASCII_8BIT:

s = 'foo'.encode(Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
s << 0xff # => "foo\xFF"
s.encoding # => #<Encoding:BINARY (ASCII-8BIT)>

Raises RangeError if that codepoint is not representable in the encoding of self:

s = 'foo'
s.encoding # => #<Encoding:UTF-8>
s << 0x00110000 # 1114112 out of char range (RangeError)
s = 'foo'.encode(Encoding::EUC_JP)
s << 0x00800080 # invalid codepoint 0x800080 in EUC-JP (RangeError)

Related: see Modifying.

self <=> other_string → -1, 0, 1, or nil

Source

static VALUE
rb_str_cmp_m(VALUE str1, VALUE str2)
{
    int result;
    VALUE s = rb_check_string_type(str2);
    if (NIL_P(s)) {
        return rb_invcmp(str1, str2);
    }
    result = rb_str_cmp(str1, s);
    return INT2FIX(result);
}

Compares self and other_string, returning:

-1 if other_string is larger.
0 if the two are equal.
1 if other_string is smaller.
nil if the two are incomparable.

Examples:

'foo' <=> 'foo' # => 0
'foo' <=> 'food' # => -1
'food' <=> 'foo' # => 1
'FOO' <=> 'foo' # => -1
'foo' <=> 'FOO' # => 1
'foo' <=> 1 # => nil

Related: see Comparing.

self == object → true or false

Source

VALUE
rb_str_equal(VALUE str1, VALUE str2)
{
    if (str1 == str2) return Qtrue;
    if (!RB_TYPE_P(str2, T_STRING)) {
        if (!rb_respond_to(str2, idTo_str)) {
            return Qfalse;
        }
        return rb_equal(str2, str1);
    }
    return rb_str_eql_internal(str1, str2);
}

Returns whether object is equal to self.

When object is a string, returns whether object has the same length and content as self:

s = 'foo'
s == 'foo' # => true
s == 'food' # => false
s == 'FOO' # => false

Returns false if the two strings’ encodings are not compatible:

"\u{e4 f6 fc}".encode(Encoding::ISO_8859_1) == ("\u{c4 d6 dc}") # => false

When object is not a string:

If object responds to method to_str, object == self is called and its return value is returned.
If object does not respond to to_str, false is returned.

Related: see Comparing.
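The non-string case can be illustrated with a small wrapper. The class Label is hypothetical and exists only for this sketch; it responds to to_str and supplies its own ==, which String#== delegates to.

```ruby
# Hypothetical wrapper that responds to to_str and defines its own ==.
class Label
  def initialize(text)
    @text = text
  end

  def to_str
    @text
  end

  # String#== calls this with the string as the argument.
  def ==(other)
    other.respond_to?(:to_str) && @text == other.to_str
  end
end

'foo' == Label.new('foo')   # => true  (result of Label#==)
'foo' == Label.new('bar')   # => false
'foo' == :foo               # => false (Symbol does not respond to to_str)
```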

Also aliased as: ===

===

Alias for: ==

self =~ object → integer or nil

Source

static VALUE
rb_str_match(VALUE x, VALUE y)
{
    switch (OBJ_BUILTIN_TYPE(y)) {
      case T_STRING:
        rb_raise(rb_eTypeError, "type mismatch: String given");
      case T_REGEXP:
        return rb_reg_match(y, x);
      default:
        return rb_funcall(y, idEqTilde, 1, x);
    }
}

When object is a Regexp, returns the index of the first substring in self matched by object, or nil if no match is found; updates Regexp-related global variables:

'foo' =~ /f/ # => 0
$~ # => #<MatchData "f">
'foo' =~ /o/ # => 1
$~ # => #<MatchData "o">
'foo' =~ /x/ # => nil
$~ # => nil

Note that string =~ regexp is different from regexp =~ string (see Regexp#=~):

number = nil
'no. 9' =~ /(?<number>\d+)/ # => 4
number # => nil # Not assigned.
/(?<number>\d+)/ =~ 'no. 9' # => 4
number # => "9" # Assigned.

If object is not a Regexp, returns the value returned by object =~ self.

Related: see Querying.

self[index] → new_string or nil

self[start, length] → new_string or nil

self[range] → new_string or nil

self[regexp, capture = 0] → new_string or nil

self[substring] → new_string or nil

Source

static VALUErb_str_aref_m(int argc, VALUE *argv, VALUE str){    if (argc == 2) {        if (RB_TYPE_P(argv[0], T_REGEXP)) {            return rb_str_subpat(str, argv[0], argv[1]);        }        else {            return rb_str_substr_two_fixnums(str, argv[0], argv[1], TRUE);        }    }    rb_check_arity(argc, 1, 2);    return rb_str_aref(str, argv[0]);}

Returns the substring of self specified by the arguments.

Form self[index]

With non-negative integer argument index given, returns the 1-character substring found in self at character offset index:

  'hello'[0] # => "h"
  'hello'[4] # => "o"
  'hello'[5] # => nil
  'тест'[2] # => "с"
  'こんにちは'[4] # => "は"

With negative integer argument index given, counts backward from the end of self:

  'hello'[-1] # => "o"
  'hello'[-5] # => "h"
  'hello'[-6] # => nil

Form self[start, length]

With integer arguments start and length given, returns a substring of size length characters (as available) beginning at the character offset specified by start.

If argument start is non-negative, the offset is start:

  'hello'[0, 1] # => "h"
  'hello'[0, 5] # => "hello"
  'hello'[0, 6] # => "hello"
  'hello'[2, 3] # => "llo"
  'hello'[2, 0] # => ""
  'hello'[2, -1] # => nil

If argument start is negative, counts backward from the end of self:

  'hello'[-1, 1] # => "o"
  'hello'[-5, 5] # => "hello"
  'hello'[-1, 0] # => ""
  'hello'[-6, 5] # => nil

Special case: if start equals the length of self, returns a new empty string:

  'hello'[5, 3] # => ""

Form self[range]

With Range argument range given, forms substring self[range.begin, range.size]:

  'hello'[0..2] # => "hel"
  'hello'[0, 3] # => "hel"
  'hello'[0...2] # => "he"
  'hello'[0, 2] # => "he"
  'hello'[0, 0] # => ""
  'hello'[0...0] # => ""

Form self[regexp, capture = 0]

With Regexp argument regexp given and capture as zero, searches for a matching substring in self; updates Regexp-related global variables:

  'hello'[/ell/] # => "ell"
  'hello'[/l+/] # => "ll"
  'hello'[//] # => ""
  'hello'[/nosuch/] # => nil

With capture as a positive integer n, returns the nth matched group:

  'hello'[/(h)(e)(l+)(o)/] # => "hello"
  'hello'[/(h)(e)(l+)(o)/, 1] # => "h"
  $1 # => "h"
  'hello'[/(h)(e)(l+)(o)/, 2] # => "e"
  $2 # => "e"
  'hello'[/(h)(e)(l+)(o)/, 3] # => "ll"
  'hello'[/(h)(e)(l+)(o)/, 4] # => "o"
  'hello'[/(h)(e)(l+)(o)/, 5] # => nil

Form self[substring]

With string argument substring given, returns the matching substring of self, if found:

  'hello'['ell'] # => "ell"
  'hello'[''] # => ""
  'hello'['nosuch'] # => nil
  'тест'['ес'] # => "ес"
  'こんにちは'['んにち'] # => "んにち"

Related: see Converting to New String.

Also aliased as: slice
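A short sketch of how the Regexp form and the slice alias might be used together. The helper name first_number is hypothetical and is not part of String; it only illustrates that a failed match yields nil rather than raising.

```ruby
# Hypothetical helper (not part of String): extract the first run of digits
# from text as an Integer, or return nil when there is none.
def first_number(text)
  digits = text[/\d+/] # Regexp form; nil when nothing matches
  digits && digits.to_i
end

found   = first_number('no. 9') # => 9
missing = first_number('none')  # => nil

s = 'hello'
same  = s.slice(1, 3) == s[1, 3] # => true; slice is an alias for []
group = s[/(h)(e)/, 2]           # => "e"
```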

self[index] = other_string → other_string

self[start, length] = other_string → other_string

self[range] = other_string → other_string

self[regexp, capture = 0] = other_string → other_string

self[substring] = other_string → other_string

Source

static VALUErb_str_aset_m(int argc, VALUE *argv, VALUE str){    if (argc == 3) {        if (RB_TYPE_P(argv[0], T_REGEXP)) {            rb_str_subpat_set(str, argv[0], argv[1], argv[2]);        }        else {            rb_str_update(str, NUM2LONG(argv[0]), NUM2LONG(argv[1]), argv[2]);        }        return argv[2];    }    rb_check_arity(argc, 2, 3);    return rb_str_aset(str, argv[0], argv[1]);}

Replaces all, a substring, or none of the contents of self; returns the argument other_string.

Form self[index] = other_string

With non-negative integer argument index given, searches for the 1-character substring found in self at character offset index:

  s = 'hello'
  s[0] = 'foo' # => "foo"
  s # => "fooello"
  s = 'hello'
  s[4] = 'foo' # => "foo"
  s # => "hellfoo"
  s = 'hello'
  s[5] = 'foo' # => "foo"
  s # => "hellofoo"
  s = 'hello'
  s[6] = 'foo' # Raises IndexError: index 6 out of string.

With negative integer argument index given, counts backward from the end of self:

  s = 'hello'
  s[-1] = 'foo' # => "foo"
  s # => "hellfoo"
  s = 'hello'
  s[-5] = 'foo' # => "foo"
  s # => "fooello"
  s = 'hello'
  s[-6] = 'foo' # Raises IndexError: index -6 out of string.

Form self[start, length] = other_string

With integer arguments start and length given, searches for a substring of size length characters (as available) beginning at the character offset specified by start.

If argument start is non-negative, the offset is start:

  s = 'hello'
  s[0, 1] = 'foo' # => "foo"
  s # => "fooello"
  s = 'hello'
  s[0, 5] = 'foo' # => "foo"
  s # => "foo"
  s = 'hello'
  s[0, 9] = 'foo' # => "foo"
  s # => "foo"
  s = 'hello'
  s[2, 0] = 'foo' # => "foo"
  s # => "hefoollo"
  s = 'hello'
  s[2, -1] = 'foo' # Raises IndexError: negative length -1.

If argument start is negative, counts backward from the end of self:

  s = 'hello'
  s[-1, 1] = 'foo' # => "foo"
  s # => "hellfoo"
  s = 'hello'
  s[-1, 9] = 'foo' # => "foo"
  s # => "hellfoo"
  s = 'hello'
  s[-5, 2] = 'foo' # => "foo"
  s # => "foollo"
  s = 'hello'
  s[-3, 0] = 'foo' # => "foo"
  s # => "hefoollo"
  s = 'hello'
  s[-6, 2] = 'foo' # Raises IndexError: index -6 out of string.

Special case: if start equals the length of self, the argument is appended to self:

  s = 'hello'
  s[5, 3] = 'foo' # => "foo"
  s # => "hellofoo"

Form self[range] = other_string

With Range argument range given, equivalent to self[range.begin, range.size] = other_string:

  s0 = 'hello'
  s1 = 'hello'
  s0[0..2] = 'foo' # => "foo"
  s1[0, 3] = 'foo' # => "foo"
  s0 # => "foolo"
  s1 # => "foolo"
  s = 'hello'
  s[0...2] = 'foo' # => "foo"
  s # => "foollo"
  s = 'hello'
  s[0...0] = 'foo' # => "foo"
  s # => "foohello"
  s = 'hello'
  s[9..10] = 'foo' # Raises RangeError: 9..10 out of range

Form self[regexp, capture = 0] = other_string

With Regexp argument regexp given and capture as zero, searches for a matching substring in self; updates Regexp-related global variables:

  s = 'hello'
  s[/l/] = 'L' # => "L"
  [$`, $&, $'] # => ["he", "l", "lo"]
  s[/eLlo/] = 'owdy' # => "owdy"
  [$`, $&, $'] # => ["h", "eLlo", ""]
  s[/eLlo/] = 'owdy' # Raises IndexError: regexp not matched.
  [$`, $&, $'] # => [nil, nil, nil]

With capture as a positive integer n, searches for the nth matched group:

  s = 'hello'
  s[/(h)(e)(l+)(o)/] = 'foo' # => "foo"
  [$`, $&, $'] # => ["", "hello", ""]
  s = 'hello'
  s[/(h)(e)(l+)(o)/, 1] = 'foo' # => "foo"
  s # => "fooello"
  [$`, $&, $'] # => ["", "hello", ""]
  s = 'hello'
  s[/(h)(e)(l+)(o)/, 2] = 'foo' # => "foo"
  s # => "hfoollo"
  [$`, $&, $'] # => ["", "hello", ""]
  s = 'hello'
  s[/(h)(e)(l+)(o)/, 4] = 'foo' # => "foo"
  s # => "hellfoo"
  [$`, $&, $'] # => ["", "hello", ""]
  s = 'hello' # => "hello"
  s[/(h)(e)(l+)(o)/, 5] = 'foo' # Raises IndexError: index 5 out of regexp.
  s = 'hello'
  s[/nosuch/] = 'foo' # Raises IndexError: regexp not matched.

Form self[substring] = other_string

With string argument substring given, searches for the first substring of self equal to substring:

  s = 'hello'
  s['l'] = 'foo' # => "foo"
  s # => "hefoolo"
  s = 'hello'
  s['ll'] = 'foo' # => "foo"
  s # => "hefooo"
  s = 'тест'
  s['ес'] = 'foo' # => "foo"
  s # => "тfooт"
  s = 'こんにちは'
  s['んにち'] = 'foo' # => "foo"
  s # => "こfooは"
  s['nosuch'] = 'foo' # Raises IndexError: string not matched.
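Because the substring and Regexp forms raise IndexError when nothing matches, callers sometimes wrap the assignment. The following is a minimal sketch under that assumption; the helper name replace_first! is hypothetical and not part of String.

```ruby
# Hypothetical helper: replace the first occurrence of target in place,
# returning true on success and false when the target is absent.
def replace_first!(string, target, replacement)
  string[target] = replacement
  true
rescue IndexError
  false
end

s = +'hello'                                 # unary plus gives a mutable string
replaced = replace_first!(s, 'll', 'LL')     # => true;  s is now "heLLo"
skipped  = replace_first!(s, 'nosuch', 'x')  # => false; s is unchanged
result   = (s[0] = 'J')                      # the assignment evaluates to "J"
```

After these calls s is "JeLLo", while result holds the assigned string rather than the receiver.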

Related: see Modifying.

append_as_bytes(*objects) → self

Source

VALUErb_str_append_as_bytes(int argc, VALUE *argv, VALUE str){    long needed_capacity = 0;    volatile VALUE t0;    enum ruby_value_type *types = ALLOCV_N(enum ruby_value_type, t0, argc);    for (int index = 0; index < argc; index++) {        VALUE obj = argv[index];        enum ruby_value_type type = types[index] = rb_type(obj);        switch (type) {          case T_FIXNUM:          case T_BIGNUM:            needed_capacity++;            break;          case T_STRING:            needed_capacity += RSTRING_LEN(obj);            break;          default:            rb_raise(                rb_eTypeError,                "wrong argument type %"PRIsVALUE" (expected String or Integer)",                rb_obj_class(obj)            );            break;        }    }    str_ensure_available_capa(str, needed_capacity);    char *sptr = RSTRING_END(str);    for (int index = 0; index < argc; index++) {        VALUE obj = argv[index];        enum ruby_value_type type = types[index];        switch (type) {          case T_FIXNUM:          case T_BIGNUM: {            argv[index] = obj = rb_int_and(obj, INT2FIX(0xff));            char byte = (char)(NUM2INT(obj) & 0xFF);            *sptr = byte;            sptr++;            break;          }          case T_STRING: {            const char *ptr;            long len;            RSTRING_GETMEM(obj, ptr, len);            memcpy(sptr, ptr, len);            sptr += len;            break;          }          default:            rb_bug("append_as_bytes arguments should have been validated");        }    }    STR_SET_LEN(str, RSTRING_LEN(str) + needed_capacity);    TERM_FILL(sptr, TERM_LEN(str)); /* sentinel */    int cr = ENC_CODERANGE(str);    switch (cr) {      case ENC_CODERANGE_7BIT: {        for (int index = 0; index < argc; index++) {            VALUE obj = argv[index];            enum ruby_value_type type = types[index];            switch (type) {              case T_FIXNUM:              case T_BIGNUM: {                if 
(!ISASCII(NUM2INT(obj))) {                    goto clear_cr;                }                break;              }              case T_STRING: {                if (ENC_CODERANGE(obj) != ENC_CODERANGE_7BIT) {                    goto clear_cr;                }                break;              }              default:                rb_bug("append_as_bytes arguments should have been validated");            }        }        break;      }      case ENC_CODERANGE_VALID:        if (ENCODING_GET_INLINED(str) == ENCINDEX_ASCII_8BIT) {            goto keep_cr;        }        else {            goto clear_cr;        }        break;      default:        goto clear_cr;        break;    }    RB_GC_GUARD(t0);  clear_cr:    // If no fast path was hit, we clear the coderange.    // append_as_bytes is predominently meant to be used in    // buffering situation, hence it's likely the coderange    // will never be scanned, so it's not worth spending time    // precomputing the coderange except for simple and common    // situations.    ENC_CODERANGE_CLEAR(str);  keep_cr:    return str;}

Concatenates each object in objects into self; returns self; performs no encoding validation or conversion:

  s = 'foo'
  s.append_as_bytes(" \xE2\x82") # => "foo \xE2\x82"
  s.valid_encoding? # => false
  s.append_as_bytes("\xAC 12")
  s.valid_encoding? # => true

When a given object is an integer, the value is considered an 8-bit byte; if the integer occupies more than one byte (i.e., is greater than 255), appends only the low-order byte (similar to String#setbyte):

  s = ""
  s.append_as_bytes(0, 257) # => "\u0000\u0001"
  s.bytesize # => 2

Related: see Modifying.

ascii_only? → true or false

Source

static VALUE
rb_str_is_ascii_only_p(VALUE str)
{
    int cr = rb_enc_str_coderange(str);
    return RBOOL(cr == ENC_CODERANGE_7BIT);
}

Returns whether self contains only ASCII characters:

  'abc'.ascii_only? # => true
  "abc\u{6666}".ascii_only? # => false

Related: see Querying.

b → new_string

Source

static VALUErb_str_b(VALUE str){    VALUE str2;    if (STR_EMBED_P(str)) {        str2 = str_alloc_embed(rb_cString, RSTRING_LEN(str) + TERM_LEN(str));    }    else {        str2 = str_alloc_heap(rb_cString);    }    str_replace_shared_without_enc(str2, str);    if (rb_enc_asciicompat(STR_ENC_GET(str))) {        // BINARY strings can never be broken; they're either 7-bit ASCII or VALID.        // If we know the receiver's code range then we know the result's code range.        int cr = ENC_CODERANGE(str);        switch (cr) {          case ENC_CODERANGE_7BIT:            ENC_CODERANGE_SET(str2, ENC_CODERANGE_7BIT);            break;          case ENC_CODERANGE_BROKEN:          case ENC_CODERANGE_VALID:            ENC_CODERANGE_SET(str2, ENC_CODERANGE_VALID);            break;          default:            ENC_CODERANGE_CLEAR(str2);            break;        }    }    return str2;}

Returns a copy of self that has ASCII-8BIT encoding; the underlying bytes are not modified:

  s = "\x99"
  s.encoding # => #<Encoding:UTF-8>
  t = s.b # => "\x99"
  t.encoding # => #<Encoding:ASCII-8BIT>
  s = "\u4095" # => "䂕"
  s.encoding # => #<Encoding:UTF-8>
  s.bytes # => [228, 130, 149]
  t = s.b # => "\xE4\x82\x95"
  t.encoding # => #<Encoding:ASCII-8BIT>
  t.bytes # => [228, 130, 149]
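One consequence worth spelling out is that the copy counts bytes as characters while the receiver is left alone. The sketch below assumes a UTF-8 source file, which is Ruby's default.

```ruby
s = "\u4095" # one character, three bytes in UTF-8
t = s.b

binary     = (t.encoding == Encoding::BINARY) # => true; BINARY is another name for ASCII-8BIT
same_bytes = (t.bytes == s.bytes)             # => true; the bytes are untouched
copy_size  = t.size                           # => 3; each byte now counts as a character
orig_size  = s.size                           # => 1
unchanged  = (s.encoding == Encoding::UTF_8)  # => true; the receiver keeps its encoding
```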

Related: see Converting to New String.

byteindex(object, offset = 0) → integer or nil

Source

static VALUErb_str_byteindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    long pos;    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        long slen = RSTRING_LEN(str);        pos = NUM2LONG(initpos);        if (pos < 0 ? (pos += slen) < 0 : pos > slen) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }    }    else {        pos = 0;    }    str_ensure_byte_pos(str, pos);    if (RB_TYPE_P(sub, T_REGEXP)) {        if (rb_reg_search(sub, str, pos, 0) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = BEG(0);            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_byteindex(str, sub, pos);        if (pos >= 0) return LONG2NUM(pos);    }    return Qnil;}

Returns the 0-based integer index of a substring of self specified by object (a string or Regexp) and offset, or nil if there is no such substring; the returned index is the count of bytes (not characters).

When object is a string, returns the index of the first found substring equal to object:

  s = 'foo' # => "foo"
  s.size # => 3 # Three 1-byte characters.
  s.bytesize # => 3 # Three bytes.
  s.byteindex('f') # => 0
  s.byteindex('o') # => 1
  s.byteindex('oo') # => 1
  s.byteindex('ooo') # => nil

When object is a Regexp, returns the index of the first found substring matching object; updates Regexp-related global variables:

  s = 'foo'
  s.byteindex(/f/) # => 0
  $~ # => #<MatchData "f">
  s.byteindex(/o/) # => 1
  s.byteindex(/oo/) # => 1
  s.byteindex(/ooo/) # => nil
  $~ # => nil

Integer argument offset, if given, specifies the 0-based index of the byte where searching is to begin.

When offset is non-negative, searching begins at byte position offset:

  s = 'foo'
  s.byteindex('o', 1) # => 1
  s.byteindex('o', 2) # => 2
  s.byteindex('o', 3) # => nil

When offset is negative, counts backward from the end of self:

  s = 'foo'
  s.byteindex('o', -1) # => 2
  s.byteindex('o', -2) # => 1
  s.byteindex('o', -3) # => 1
  s.byteindex('o', -4) # => nil

Raises IndexError if the byte at offset is not the first byte of a character:

  s = "\uFFFF\uFFFF" # => "\uFFFF\uFFFF"
  s.size # => 2 # Two 3-byte characters.
  s.bytesize # => 6 # Six bytes.
  s.byteindex("\uFFFF") # => 0
  s.byteindex("\uFFFF", 1) # Raises IndexError
  s.byteindex("\uFFFF", 2) # Raises IndexError
  s.byteindex("\uFFFF", 3) # => 3
  s.byteindex("\uFFFF", 4) # Raises IndexError
  s.byteindex("\uFFFF", 5) # Raises IndexError
  s.byteindex("\uFFFF", 6) # => nil

Related: see Querying.

byterindex(object, offset = self.bytesize) → integer or nil

Source

static VALUErb_str_byterindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    long pos, len = RSTRING_LEN(str);    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        pos = NUM2LONG(initpos);        if (pos < 0 && (pos += len) < 0) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }        if (pos > len) pos = len;    }    else {        pos = len;    }    str_ensure_byte_pos(str, pos);    if (RB_TYPE_P(sub, T_REGEXP)) {        if (rb_reg_search(sub, str, pos, 1) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = BEG(0);            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_byterindex(str, sub, pos);        if (pos >= 0) return LONG2NUM(pos);    }    return Qnil;}

Returns the 0-based integer index of a substring of self that is the last match for the given object (a string or Regexp) and offset, or nil if there is no such substring; the returned index is the count of bytes (not characters).

When object is a string, returns the index of the last found substring equal to object:

  s = 'foo' # => "foo"
  s.size # => 3 # Three 1-byte characters.
  s.bytesize # => 3 # Three bytes.
  s.byterindex('f') # => 0
  s.byterindex('o') # => 2
  s.byterindex('oo') # => 1
  s.byterindex('ooo') # => nil

When object is a Regexp, returns the index of the last found substring matching object; updates Regexp-related global variables:

  s = 'foo'
  s.byterindex(/f/) # => 0
  $~ # => #<MatchData "f">
  s.byterindex(/o/) # => 2
  s.byterindex(/oo/) # => 1
  s.byterindex(/ooo/) # => nil
  $~ # => nil

The last match is the match that starts at the last possible position, not the last of the longest matches:

  s = 'foo'
  s.byterindex(/o+/) # => 2
  $~ # => #<MatchData "o">

To get the last longest match, use a negative lookbehind:

  s = 'foo'
  s.byterindex(/(?<!o)o+/) # => 1
  $~ # => #<MatchData "oo">

Or use method byteindex with negative lookahead:

  s = 'foo'
  s.byteindex(/o+(?!.*o)/) # => 1
  $~ # => #<MatchData "oo">

Integer argument offset, if given, specifies the 0-based index of the byte where searching is to end.

When offset is non-negative, searching ends at byte position offset:

  s = 'foo'
  s.byterindex('o', 0) # => nil
  s.byterindex('o', 1) # => 1
  s.byterindex('o', 2) # => 2
  s.byterindex('o', 3) # => 2

When offset is negative, counts backward from the end of self:

  s = 'foo'
  s.byterindex('o', -1) # => 2
  s.byterindex('o', -2) # => 1
  s.byterindex('o', -3) # => nil

Raises IndexError if the byte at offset is not the first byte of a character:

  s = "\uFFFF\uFFFF" # => "\uFFFF\uFFFF"
  s.size # => 2 # Two 3-byte characters.
  s.bytesize # => 6 # Six bytes.
  s.byterindex("\uFFFF") # => 3
  s.byterindex("\uFFFF", 1) # Raises IndexError
  s.byterindex("\uFFFF", 2) # Raises IndexError
  s.byterindex("\uFFFF", 3) # => 3
  s.byterindex("\uFFFF", 4) # Raises IndexError
  s.byterindex("\uFFFF", 5) # Raises IndexError
  s.byterindex("\uFFFF", 6) # => 3 # Same as the default offset (self.bytesize).

Related: see Querying.

bytes → array_of_bytes

Source

static VALUE
rb_str_bytes(VALUE str)
{
    VALUE ary = WANTARRAY("bytes", RSTRING_LEN(str));
    return rb_str_enumerate_bytes(str, ary);
}

Returns an array of the bytes in self:

  'hello'.bytes # => [104, 101, 108, 108, 111]
  'тест'.bytes # => [209, 130, 208, 181, 209, 129, 209, 130]
  'こんにちは'.bytes # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]
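A common use of the byte array is a hex dump. This is a minimal sketch; the helper name hex_dump is hypothetical and not part of String.

```ruby
# Hypothetical helper: render each byte of string as two lowercase hex digits.
def hex_dump(string)
  string.bytes.map { |byte| format('%02x', byte) }.join(' ')
end

ascii    = hex_dump('hello') # => "68 65 6c 6c 6f"
cyrillic = hex_dump('те')    # => "d1 82 d0 b5"
```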

Related: see Converting to Non-String.

bytesize → integer

Source

VALUE
rb_str_bytesize(VALUE str)
{
    return LONG2NUM(RSTRING_LEN(str));
}

Returns the count of bytes in self.

Note that the byte count may be different from the character count (returned by size):

  s = 'foo'
  s.bytesize # => 3
  s.size # => 3
  s = 'тест'
  s.bytesize # => 8
  s.size # => 4
  s = 'こんにちは'
  s.bytesize # => 15
  s.size # => 5

Related: see Querying.

byteslice(offset, length = 1) → string or nil

byteslice(range) → string or nil

Source

static VALUE
rb_str_byteslice(int argc, VALUE *argv, VALUE str)
{
    if (argc == 2) {
        long beg = NUM2LONG(argv[0]);
        long len = NUM2LONG(argv[1]);
        return str_byte_substr(str, beg, len, TRUE);
    }
    rb_check_arity(argc, 1, 2);
    return str_byte_aref(str, argv[0]);
}

Returns a substring of self, or nil if the substring cannot be constructed.

With integer arguments offset and length given, returns the substring beginning at the given offset and of the given length (as available):

  s = '0123456789' # => "0123456789"
  s.byteslice(2) # => "2"
  s.byteslice(200) # => nil
  s.byteslice(4, 3) # => "456"
  s.byteslice(4, 30) # => "456789"

Returns nil if length is negative or offset falls outside of self:

  s.byteslice(4, -1) # => nil
  s.byteslice(40, 2) # => nil

Counts backwards from the end of self if offset is negative:

  s = '0123456789' # => "0123456789"
  s.byteslice(-4) # => "6"
  s.byteslice(-4, 3) # => "678"

With Range argument range given, returns byteslice(range.begin, range.size):

  s = '0123456789' # => "0123456789"
  s.byteslice(4..6) # => "456"
  s.byteslice(-6..-4) # => "456"
  s.byteslice(5..2) # => "" # range.size is zero.
  s.byteslice(40..42) # => nil

The starting and ending offsets need not be on character boundaries:

  s = 'こんにちは'
  s.byteslice(0, 3) # => "こ"
  s.byteslice(1, 3) # => "\x81\x93\xE3"

The encodings of self and the returned substring are always the same:

  s.encoding # => #<Encoding:UTF-8>
  s.byteslice(0, 3).encoding # => #<Encoding:UTF-8>
  s.byteslice(1, 3).encoding # => #<Encoding:UTF-8>

But, depending on the character boundaries, the encoding of the returned substring may not be valid:

  s.valid_encoding? # => true
  s.byteslice(0, 3).valid_encoding? # => true
  s.byteslice(1, 3).valid_encoding? # => false
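Since a byte slice can cut a character in half, code that truncates to a byte budget usually has to back off to a character boundary. The sketch below shows one way to do that; the helper name truncate_bytes is hypothetical, and it assumes limit is non-negative.

```ruby
# Hypothetical helper: keep at most limit bytes of string, dropping any
# partial trailing character. Assumes limit >= 0.
def truncate_bytes(string, limit)
  piece = string.byteslice(0, limit)
  # Back off one byte at a time until the slice is valid in its encoding;
  # the empty string is always valid, so the loop terminates.
  piece = piece.byteslice(0, piece.bytesize - 1) until piece.valid_encoding?
  piece
end

cut   = truncate_bytes('こんにちは', 7)   # => "こん" (6 bytes; the 7th byte is dropped)
short = truncate_bytes('hello', 3)        # => "hel"
whole = truncate_bytes('こんにちは', 100) # => "こんにちは"
```

The loop removes at most a few bytes in practice, because a partial UTF-8 character is never longer than three bytes.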

Related: see Converting to New String.

bytesplice(offset, length, str) → self

bytesplice(offset, length, str, str_offset, str_length) → self

bytesplice(range, str) → self

bytesplice(range, str, str_range) → self

Source

static VALUErb_str_bytesplice(int argc, VALUE *argv, VALUE str){    long beg, len, vbeg, vlen;    VALUE val;    int cr;    rb_check_arity(argc, 2, 5);    if (!(argc == 2 || argc == 3 || argc == 5)) {        rb_raise(rb_eArgError, "wrong number of arguments (given %d, expected 2, 3, or 5)", argc);    }    if (argc == 2 || (argc == 3 && !RB_INTEGER_TYPE_P(argv[0]))) {        if (!rb_range_beg_len(argv[0], &beg, &len, RSTRING_LEN(str), 2)) {            rb_raise(rb_eTypeError, "wrong argument type %s (expected Range)",                     rb_builtin_class_name(argv[0]));        }        val = argv[1];        StringValue(val);        if (argc == 2) {            /* bytesplice(range, str) */            vbeg = 0;            vlen = RSTRING_LEN(val);        }        else {            /* bytesplice(range, str, str_range) */            if (!rb_range_beg_len(argv[2], &vbeg, &vlen, RSTRING_LEN(val), 2)) {                rb_raise(rb_eTypeError, "wrong argument type %s (expected Range)",                         rb_builtin_class_name(argv[2]));            }        }    }    else {        beg = NUM2LONG(argv[0]);        len = NUM2LONG(argv[1]);        val = argv[2];        StringValue(val);        if (argc == 3) {            /* bytesplice(index, length, str) */            vbeg = 0;            vlen = RSTRING_LEN(val);        }        else {            /* bytesplice(index, length, str, str_index, str_length) */            vbeg = NUM2LONG(argv[3]);            vlen = NUM2LONG(argv[4]);        }    }    str_check_beg_len(str, &beg, &len);    str_check_beg_len(val, &vbeg, &vlen);    str_modify_keep_cr(str);    if (RB_UNLIKELY(ENCODING_GET_INLINED(str) != ENCODING_GET_INLINED(val))) {        rb_enc_associate(str, rb_enc_check(str, val));    }    rb_str_update_1(str, beg, len, val, vbeg, vlen);    cr = ENC_CODERANGE_AND(ENC_CODERANGE(str), ENC_CODERANGE(val));    if (cr != ENC_CODERANGE_BROKEN)        ENC_CODERANGE_SET(str, cr);    return str;}

Replaces target bytes in self with source bytes from the given string str; returns self.

In the first form, arguments offset and length determine the target bytes, and the source bytes are all of the given str:

  '0123456789'.bytesplice(0, 3, 'abc') # => "abc3456789"
  '0123456789'.bytesplice(3, 3, 'abc') # => "012abc6789"
  '0123456789'.bytesplice(0, 50, 'abc') # => "abc"
  '0123456789'.bytesplice(50, 3, 'abc') # Raises IndexError.

The counts of the target bytes and source bytes may be different:

  '0123456789'.bytesplice(0, 6, 'abc') # => "abc6789" # Shorter source.
  '0123456789'.bytesplice(0, 1, 'abc') # => "abc123456789" # Shorter target.

And either count may be zero (i.e., specifying an empty string):

  '0123456789'.bytesplice(0, 3, '') # => "3456789" # Empty source.
  '0123456789'.bytesplice(0, 0, 'abc') # => "abc0123456789" # Empty target.

In the second form, just as in the first, arguments offset and length determine the target bytes; argument str contains the source bytes, and the additional arguments str_offset and str_length determine the actual source bytes:

  '0123456789'.bytesplice(0, 3, 'abc', 0, 3) # => "abc3456789"
  '0123456789'.bytesplice(0, 3, 'abc', 1, 1) # => "b3456789" # Shorter source.
  '0123456789'.bytesplice(0, 1, 'abc', 0, 3) # => "abc123456789" # Shorter target.
  '0123456789'.bytesplice(0, 3, 'abc', 1, 0) # => "3456789" # Empty source.
  '0123456789'.bytesplice(0, 0, 'abc', 0, 3) # => "abc0123456789" # Empty target.

In the third form, argument range determines the target bytes and the source bytes are all of the given str:

  '0123456789'.bytesplice(0..2, 'abc') # => "abc3456789"
  '0123456789'.bytesplice(3..5, 'abc') # => "012abc6789"
  '0123456789'.bytesplice(0..5, 'abc') # => "abc6789" # Shorter source.
  '0123456789'.bytesplice(0..0, 'abc') # => "abc123456789" # Shorter target.
  '0123456789'.bytesplice(0..2, '') # => "3456789" # Empty source.
  '0123456789'.bytesplice(0...0, 'abc') # => "abc0123456789" # Empty target.

In the fourth form, just as in the third, argument range determines the target bytes; argument str contains the source bytes, and the additional argument str_range determines the actual source bytes:

  '0123456789'.bytesplice(0..2, 'abc', 0..2) # => "abc3456789"
  '0123456789'.bytesplice(3..5, 'abc', 0..2) # => "012abc6789"
  '0123456789'.bytesplice(0..2, 'abc', 0..1) # => "ab3456789" # Shorter source.
  '0123456789'.bytesplice(0..1, 'abc', 0..2) # => "abc23456789" # Shorter target.
  '0123456789'.bytesplice(0..2, 'abc', 0...0) # => "3456789" # Empty source.
  '0123456789'.bytesplice(0...0, 'abc', 0..2) # => "abc0123456789" # Empty target.

In any of the forms, the beginnings and endings of both source and target must be on character boundaries.

In these examples, self has five 3-byte characters, and so has character boundaries at offsets 0, 3, 6, 9, 12, and 15.

  'こんにちは'.bytesplice(0, 3, 'abc') # => "abcんにちは"
  'こんにちは'.bytesplice(1, 3, 'abc') # Raises IndexError.
  'こんにちは'.bytesplice(0, 2, 'abc') # Raises IndexError.

capitalize(mapping = :ascii) → string

Source

static VALUErb_str_capitalize(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_TITLECASE;    VALUE ret;    flags = check_case_options(argc, argv, flags);    enc = str_true_enc(str);    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return str;    if (flags&ONIGENC_CASE_ASCII_ONLY) {        ret = rb_str_new(0, RSTRING_LEN(str));        rb_str_ascii_casemap(str, ret, &flags, enc);    }    else {        ret = rb_str_casemap(str, &flags, enc);    }    return ret;}

Returns a string containing the characters in self, each with possibly changed case:

The first character is upcased.
All other characters are downcased.

Examples:

  'hello world'.capitalize # => "Hello world"
  'HELLO WORLD'.capitalize # => "Hello world"

Some characters do not have upcase and downcase, and so are not changed; see Case Mapping:

  '1, 2, 3, ...'.capitalize # => "1, 2, 3, ..."

The casing is affected by the given mapping, which may be :ascii, :fold, or :turkic; see Case Mapping.

Related: see Converting to New String.

capitalize!(mapping = :ascii) → self or nil

Source

static VALUErb_str_capitalize_bang(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_TITLECASE;    flags = check_case_options(argc, argv, flags);    str_modify_keep_cr(str);    enc = str_true_enc(str);    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return Qnil;    if (flags&ONIGENC_CASE_ASCII_ONLY)        rb_str_ascii_casemap(str, str, &flags, enc);    else        str_shared_replace(str, rb_str_casemap(str, &flags, enc));    if (ONIGENC_CASE_MODIFIED&flags) return str;    return Qnil;}

Like String#capitalize, except that:

Changes character casings in self (not in a copy of self).
Returns self if any changes are made, nil otherwise.
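The nil return value matters when calls are chained. The following sketch contrasts the two forms; the variable names are illustrative only.

```ruby
name = +'Alice'
changed = name.capitalize! # => nil; nothing needed changing

# Chaining on the bang form is therefore fragile:
#   name.capitalize!.center(9, '*') # would call center on nil here
framed = name.capitalize.center(9, '*') # => "**Alice**"

lower  = +'alice'
result = lower.capitalize! # => "Alice"; the same object as lower
```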

Related: see Modifying.

casecmp(other_string) → -1, 0, 1, or nil

Source

static VALUE
rb_str_casecmp(VALUE str1, VALUE str2)
{
    VALUE s = rb_check_string_type(str2);
    if (NIL_P(s)) {
        return Qnil;
    }
    return str_casecmp(str1, s);
}

Ignoring case, compares self and other_string; returns:

-1 if self.downcase is smaller than other_string.downcase.
0 if the two are equal.
1 if self.downcase is larger than other_string.downcase.
nil if the two are incomparable.

See Case Mapping.

Examples:

  'foo'.casecmp('goo') # => -1
  'goo'.casecmp('foo') # => 1
  'foo'.casecmp('food') # => -1
  'food'.casecmp('foo') # => 1
  'FOO'.casecmp('foo') # => 0
  'foo'.casecmp('FOO') # => 0
  'foo'.casecmp(1) # => nil

Related: see Comparing.

casecmp?(other_string) → true, false, or nil

Source

static VALUE
rb_str_casecmp_p(VALUE str1, VALUE str2)
{
    VALUE s = rb_check_string_type(str2);
    if (NIL_P(s)) {
        return Qnil;
    }
    return str_casecmp_p(str1, s);
}

Returns true if self and other_string are equal after Unicode case folding, false if unequal, nil if incomparable.

See Case Mapping.

Examples:

  'foo'.casecmp?('goo') # => false
  'goo'.casecmp?('foo') # => false
  'foo'.casecmp?('food') # => false
  'food'.casecmp?('foo') # => false
  'FOO'.casecmp?('foo') # => true
  'foo'.casecmp?('FOO') # => true
  'foo'.casecmp?(1) # => nil
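The two methods can disagree on non-ASCII text. The sketch below assumes, as in current CRuby to the best of my knowledge, that casecmp folds only the ASCII letters A-Z and a-z while casecmp? applies Unicode case folding; it also shows casecmp as a sort comparator.

```ruby
words    = %w[banana Cherry apple]
plain    = words.sort                         # => ["Cherry", "apple", "banana"]
caseless = words.sort { |a, b| a.casecmp(b) } # => ["apple", "banana", "Cherry"]

lower = "\u00E4\u00F6\u00FC" # "äöü"
upper = "\u00C4\u00D6\u00DC" # "ÄÖÜ"
ascii_fold   = lower.casecmp(upper)  # => 1; the non-ASCII letters are not folded
unicode_fold = lower.casecmp?(upper) # => true
```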

Related: see Comparing.

center(size, pad_string = ' ') → new_string

Source

static VALUE
rb_str_center(int argc, VALUE *argv, VALUE str)
{
    return rb_str_justify(argc, argv, str, 'c');
}

Returns a centered copy of self.

If integer argument size is greater than the size (in characters) of self, returns a new string of length size that is a copy of self, centered and padded on one or both ends with pad_string:

  'hello'.center(6) # => "hello " # Padded on one end.
  'hello'.center(10) # => "  hello   " # Padded on both ends.
  'hello'.center(20, '-|') # => "-|-|-|-hello-|-|-|-|" # Some padding repeated.
  'hello'.center(10, 'abcdefg') # => "abhelloabc" # Some padding not used.
  '  hello  '.center(13) # => "    hello    "
  'тест'.center(10) # => "   тест   "
  'こんにちは'.center(10) # => "  こんにちは   " # Multi-byte characters.

If size is less than or equal to the size of self, returns an unpadded copy of self:

  'hello'.center(5) # => "hello"
  'hello'.center(-10) # => "hello"
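A typical use is a fixed-width banner line. This is a minimal sketch; the helper name banner and its default width are hypothetical.

```ruby
# Hypothetical helper: center a title inside a line of '=' characters.
def banner(title, width = 20)
  " #{title} ".center(width, '=')
end

line = banner('Report')                    # => "====== Report ======"
long = banner('A title wider than 10', 10) # => " A title wider than 10 " (no padding added)
```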

Related: see Converting to New String.

chars → array_of_characters

Source

static VALUE
rb_str_chars(VALUE str)
{
    VALUE ary = WANTARRAY("chars", rb_str_strlen(str));
    return rb_str_enumerate_chars(str, ary);
}

Returns an array of the characters in self:

  'hello'.chars # => ["h", "e", "l", "l", "o"]
  'тест'.chars # => ["т", "е", "с", "т"]
  'こんにちは'.chars # => ["こ", "ん", "に", "ち", "は"]
  ''.chars # => []

Related: see Converting to Non-String.

chomp(line_sep = $/) → new_string

Source

static VALUE
rb_str_chomp(int argc, VALUE *argv, VALUE str)
{
    VALUE rs = chomp_rs(argc, argv);
    if (NIL_P(rs)) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, 0, chompped_length(str, rs));
}

Returns a new string copied from self, with trailing characters possibly removed.

When line_sep is "\n", removes the last one or two characters if they are "\r", "\n", or "\r\n" (but not "\n\r"):

  $/ # => "\n"
  "abc\r".chomp # => "abc"
  "abc\n".chomp # => "abc"
  "abc\r\n".chomp # => "abc"
  "abc\n\r".chomp # => "abc\n"
  "тест\r\n".chomp # => "тест"
  "こんにちは\r\n".chomp # => "こんにちは"

When line_sep is '' (an empty string), removes multiple trailing occurrences of "\n" or "\r\n" (but not "\r" or "\n\r"):

  "abc\n\n\n".chomp('') # => "abc"
  "abc\r\n\r\n\r\n".chomp('') # => "abc"
  "abc\n\n\r\n\r\n\n\n".chomp('') # => "abc"
  "abc\n\r\n\r\n\r".chomp('') # => "abc\n\r\n\r\n\r"
  "abc\r\r\r".chomp('') # => "abc\r\r\r"

When line_sep is neither "\n" nor '', removes a single trailing line separator if there is one:

  'abcd'.chomp('cd') # => "ab"
  'abcdcd'.chomp('cd') # => "abcd"
  'abcd'.chomp('xx') # => "abcd"
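In practice chomp is most often applied to each line of some input. The sketch below uses an inline string with mixed line endings in place of a real file; the variable names are illustrative.

```ruby
raw   = "alpha\r\nbeta\n\ngamma"
lines = raw.each_line.map(&:chomp) # => ["alpha", "beta", "", "gamma"]

# A custom separator removes at most one trailing occurrence.
trimmed = 'path/to/dir/'.chomp('/') # => "path/to/dir"
```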

Related: see Converting to New String.

chomp!(line_sep = $/) → self or nil

Source

static VALUE
rb_str_chomp_bang(int argc, VALUE *argv, VALUE str)
{
    VALUE rs;
    str_modifiable(str);
    if (RSTRING_LEN(str) == 0 && argc < 2) return Qnil;
    rs = chomp_rs(argc, argv);
    if (NIL_P(rs)) return Qnil;
    return rb_str_chomp_string(str, rs);
}

Like String#chomp, except that:

Removes trailing characters from self (not from a copy of self).
Returns self if any characters are removed, nil otherwise.

Related: see Modifying.

chop → new_string

Source

static VALUE
rb_str_chop(VALUE str)
{
    return rb_str_subseq(str, 0, chopped_length(str));
}

Returns a new string copied from self, with trailing characters possibly removed.

Removes "\r\n" if those are the last two characters.

  "abc\r\n".chop # => "abc"
  "тест\r\n".chop # => "тест"
  "こんにちは\r\n".chop # => "こんにちは"

Otherwise removes the last character if it exists.

  'abcd'.chop # => "abc"
  'тест'.chop # => "тес"
  'こんにちは'.chop # => "こんにち"
  ''.chop # => ""

If you only need to remove the newline separator at the end of the string, String#chomp is a better alternative.

Related: see Converting to New String.

chop! → self or nil

Source

static VALUErb_str_chop_bang(VALUE str){    str_modify_keep_cr(str);    if (RSTRING_LEN(str) > 0) {        long len;        len = chopped_length(str);        STR_SET_LEN(str, len);        TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));        if (ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {            ENC_CODERANGE_CLEAR(str);        }        return str;    }    return Qnil;}

Like String#chop, except that:

Removes trailing characters from self (not from a copy of self).
Returns self if any characters are removed, nil otherwise.
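The difference between chop and chomp shows up when the string has no trailing newline. The sketch below places the two side by side and also shows when chop! returns nil; the variable names are illustrative.

```ruby
no_newline   = 'hello'
with_newline = "hello\n"

chomped_plain = no_newline.chomp   # => "hello"; nothing to remove
chopped_plain = no_newline.chop    # => "hell";  the last character is always removed
chomped_line  = with_newline.chomp # => "hello"
chopped_line  = with_newline.chop  # => "hello"

empty   = +''
nothing = empty.chop! # => nil; an empty string has nothing to remove
two     = +'ab'
changed = two.chop!   # => "a"; the same object as two
```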

Related: see Modifying.

chr → string

Source

static VALUE
rb_str_chr(VALUE str)
{
    return rb_str_substr(str, 0, 1);
}

Returns a string containing the first character of self:

  'hello'.chr # => "h"
  'тест'.chr # => "т"
  'こんにちは'.chr # => "こ"
  ''.chr # => ""

Related: see Converting to New String.

clear → self

Source

static VALUE
rb_str_clear(VALUE str)
{
    str_discard(str);
    STR_SET_EMBED(str);
    STR_SET_LEN(str, 0);
    RSTRING_PTR(str)[0] = 0;
    if (rb_enc_asciicompat(STR_ENC_GET(str)))
        ENC_CODERANGE_SET(str, ENC_CODERANGE_7BIT);
    else
        ENC_CODERANGE_SET(str, ENC_CODERANGE_VALID);
    return str;
}

Removes the contents of self:

s = 'foo'
s.clear # => ""
s # => ""

Related: see Modifying.

codepoints → array_of_integers

Source

static VALUE
rb_str_codepoints(VALUE str)
{
    VALUE ary = WANTARRAY("codepoints", rb_str_strlen(str));
    return rb_str_enumerate_codepoints(str, ary);
}

Returns an array of the codepoints in self; each codepoint is the integer value for a character:

'hello'.codepoints # => [104, 101, 108, 108, 111]
'тест'.codepoints # => [1090, 1077, 1089, 1090]
'こんにちは'.codepoints # => [12371, 12435, 12395, 12385, 12399]
''.codepoints # => []

Related: see Converting to Non-String.

concat(*objects) → string

Source

static VALUE
rb_str_concat_multi(int argc, VALUE *argv, VALUE str)
{
    str_modifiable(str);
    if (argc == 1) {
        return rb_str_concat(str, argv[0]);
    }
    else if (argc > 1) {
        int i;
        VALUE arg_str = rb_str_tmp_new(0);
        rb_enc_copy(arg_str, str);
        for (i = 0; i < argc; i++) {
            rb_str_concat(arg_str, argv[i]);
        }
        rb_str_buf_append(str, arg_str);
    }
    return str;
}

Concatenates each object in objects to self; returns self:

'foo'.concat('bar', 'baz') # => "foobarbaz"

For each given object that is an integer, the value is considered a codepoint and converted to a character before concatenation:

'foo'.concat(32, 'bar', 32, 'baz') # => "foo bar baz" # Embeds spaces.
'те'.concat(1089, 1090) # => "тест"
'こん'.concat(12395, 12385, 12399) # => "こんにちは"
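As a rough illustration of mixing strings and integer codepoints, the sketch below builds a space-separated string; the helper name join_words is hypothetical. The last two lines rely on the C source shown above, which collects multiple arguments into a temporary string before appending, so passing the receiver itself as an argument uses its original content:

```ruby
# Hypothetical helper: joins words with spaces by appending the
# integer codepoint 32 (a space) between them.
def join_words(*words)
  result = +''
  words.each_with_index do |word, index|
    result.concat(32) unless index.zero? # 32 is the codepoint for ' '
    result.concat(word)
  end
  result
end

join_words('foo', 'bar', 'baz') # => "foo bar baz"

s = +'foo'
s.concat(s, s) # => "foofoofoo"
```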

Related: see Converting to New String.

count(*selectors) → integer

Source

static VALUErb_str_count(int argc, VALUE *argv, VALUE str){    char table[TR_TABLE_SIZE];    rb_encoding *enc = 0;    VALUE del = 0, nodel = 0, tstr;    char *s, *send;    int i;    int ascompat;    size_t n = 0;    rb_check_arity(argc, 1, UNLIMITED_ARGUMENTS);    tstr = argv[0];    StringValue(tstr);    enc = rb_enc_check(str, tstr);    if (argc == 1) {        const char *ptstr;        if (RSTRING_LEN(tstr) == 1 && rb_enc_asciicompat(enc) &&            (ptstr = RSTRING_PTR(tstr),             ONIGENC_IS_ALLOWED_REVERSE_MATCH(enc, (const unsigned char *)ptstr, (const unsigned char *)ptstr+1)) &&            !is_broken_string(str)) {            int clen;            unsigned char c = rb_enc_codepoint_len(ptstr, ptstr+1, &clen, enc);            s = RSTRING_PTR(str);            if (!s || RSTRING_LEN(str) == 0) return INT2FIX(0);            send = RSTRING_END(str);            while (s < send) {                if (*(unsigned char*)s++ == c) n++;            }            return SIZET2NUM(n);        }    }    tr_setup_table(tstr, table, TRUE, &del, &nodel, enc);    for (i=1; i<argc; i++) {        tstr = argv[i];        StringValue(tstr);        enc = rb_enc_check(str, tstr);        tr_setup_table(tstr, table, FALSE, &del, &nodel, enc);    }    s = RSTRING_PTR(str);    if (!s || RSTRING_LEN(str) == 0) return INT2FIX(0);    send = RSTRING_END(str);    ascompat = rb_enc_asciicompat(enc);    while (s < send) {        unsigned int c;        if (ascompat && (c = *(unsigned char*)s) < 0x80) {            if (table[c]) {                n++;            }            s++;        }        else {            int clen;            c = rb_enc_codepoint_len(s, send, &clen, enc);            if (tr_find(c, table, del, nodel)) {                n++;            }            s += clen;        }    }    return SIZET2NUM(n);}

Returns the total number of characters in self that are specified by the given selectors.

For one 1-character selector, returns the count of instances of that character:

s = 'abracadabra'
s.count('a') # => 5
s.count('b') # => 2
s.count('x') # => 0
s.count('') # => 0
s = 'тест'
s.count('т') # => 2
s.count('е') # => 1
s = 'よろしくお願いします'
s.count('よ') # => 1
s.count('し') # => 2

For one multi-character selector, returns the count of instances for all specified characters:

s = 'abracadabra'
s.count('ab') # => 7
s.count('abc') # => 8
s.count('abcd') # => 9
s.count('abcdr') # => 11
s.count('abcdrx') # => 11

Order and repetition do not matter:

s.count('ba') == s.count('ab') # => true
s.count('baab') == s.count('ab') # => true

For multiple selectors, forms a single selector that is the intersection of characters in all selectors and returns the count of instances for that selector:

s = 'abcdefg'
s.count('abcde', 'dcbfg') == s.count('bcd') # => true
s.count('abc', 'def') == s.count('') # => true

In a character selector, three characters get special treatment:

A caret ('^') functions as a negation operator for the immediately following characters:
```
s = 'abracadabra'
s.count('^bc') # => 8  # Count of all except 'b' and 'c'.
```

A hyphen ('-') between two other characters defines a range of characters:

s = 'abracadabra'
s.count('a-c') # => 8  # Count of all 'a', 'b', and 'c'.

A backslash ('\') acts as an escape for a caret, a hyphen, or another backslash:

s = 'abracadabra'
s.count('\^bc') # => 3  # Count of '^', 'b', and 'c'.
s.count('a\-c') # => 6  # Count of 'a', '-', and 'c'.
'foo\bar\baz'.count('\\') # => 2  # Count of '\'.

These usages may be mixed:

s = 'abracadabra'
s.count('a-cq-t') # => 10  # Multiple ranges.
s.count('ac-d') # => 7   # Range mixed with plain characters.
s.count('^a-c') # => 3   # Range mixed with negation.

For multiple selectors, all forms may be used, including negations, ranges, and escapes.

s = 'abracadabra'
s.count('^abc', '^def') == s.count('^abcdef') # => true
s.count('a-e', 'c-g') == s.count('cde') # => true
s.count('^abc', 'c-g') == s.count('defg') # => true
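The selector forms above can be combined into a small tally of character classes. This is only a sketch; the helper name character_summary and its keys are hypothetical:

```ruby
# Hypothetical helper: tallies a few ASCII character classes using
# plain, range, negated, and intersected selectors.
def character_summary(text)
  {
    vowels: text.count('aeiouAEIOU'),               # plain characters
    digits: text.count('0-9'),                      # range
    non_alphanumeric: text.count('^a-zA-Z0-9'),     # negated ranges
    lowercase_consonants: text.count('a-z', '^aeiou') # intersection
  }
end

summary = character_summary('Hello, World 2024!')
summary[:vowels]               # => 3
summary[:digits]               # => 4
summary[:non_alphanumeric]     # => 4
summary[:lowercase_consonants] # => 5
```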

Related: see Querying.

crypt(salt_str) → new_string

Source

static VALUErb_str_crypt(VALUE str, VALUE salt){#ifdef HAVE_CRYPT_R    VALUE databuf;    struct crypt_data *data;#   define CRYPT_END() ALLOCV_END(databuf)#else    char *tmp_buf;    extern char *crypt(const char *, const char *);#   define CRYPT_END() rb_nativethread_lock_unlock(&crypt_mutex.lock)#endif    VALUE result;    const char *s, *saltp;    char *res;#ifdef BROKEN_CRYPT    char salt_8bit_clean[3];#endif    StringValue(salt);    mustnot_wchar(str);    mustnot_wchar(salt);    s = StringValueCStr(str);    saltp = RSTRING_PTR(salt);    if (RSTRING_LEN(salt) < 2 || !saltp[0] || !saltp[1]) {        rb_raise(rb_eArgError, "salt too short (need >=2 bytes)");    }#ifdef BROKEN_CRYPT    if (!ISASCII((unsigned char)saltp[0]) || !ISASCII((unsigned char)saltp[1])) {        salt_8bit_clean[0] = saltp[0] & 0x7f;        salt_8bit_clean[1] = saltp[1] & 0x7f;        salt_8bit_clean[2] = '\0';        saltp = salt_8bit_clean;    }#endif#ifdef HAVE_CRYPT_R    data = ALLOCV(databuf, sizeof(struct crypt_data));# ifdef HAVE_STRUCT_CRYPT_DATA_INITIALIZED    data->initialized = 0;# endif    res = crypt_r(s, saltp, data);#else    rb_nativethread_lock_lock(&crypt_mutex.lock);    res = crypt(s, saltp);#endif    if (!res) {        int err = errno;        CRYPT_END();        rb_syserr_fail(err, "crypt");    }#ifdef HAVE_CRYPT_R    result = rb_str_new_cstr(res);    CRYPT_END();#else    // We need to copy this buffer because it's static and we need to unlock the mutex    // before allocating a new object (the string to be returned). If we allocate while    // holding the lock, we could run GC which fires the VM barrier and causes a deadlock    // if other ractors are waiting on this lock.    size_t res_size = strlen(res)+1;    tmp_buf = ALLOCA_N(char, res_size); // should be small enough to alloca    memcpy(tmp_buf, res, res_size);    res = tmp_buf;    CRYPT_END();    result = rb_str_new_cstr(res);#endif    return result;}

Returns the string generated by calling the crypt(3) standard library function with str and salt_str, in this order, as its arguments. Please do not use this method any longer. It is legacy, provided only for backward compatibility with Ruby scripts from earlier days. It is a poor choice in contemporary programs for several reasons:

The behaviour of C’s crypt(3) depends on the OS on which it runs. The generated string lacks data portability.
On some OSes such as Mac OS, crypt(3) never fails (i.e. it silently produces unexpected results).
On some OSes such as Mac OS, crypt(3) is not thread safe.
So-called “traditional” usage of crypt(3) is very weak. According to its manpage, Linux’s traditional crypt(3) output has only 2**56 variations, which is too easy to brute-force today. And this is the default behaviour.
To make things more robust, some OSes implement so-called “modular” usage. To use it, you have to build up the salt_str parameter by hand, which is complex. Failure to generate a proper salt string tends not to yield any errors; typos in parameters are normally not detectable.
- For instance, in the following example, the second invocation of String#crypt is wrong; it has a typo in “round=” (lacks “s”). However, the call does not fail, and something unexpected is generated.
```
"foo".crypt("$5$rounds=1000$salt$") # OK, proper usage
"foo".crypt("$5$round=1000$salt$") # Typo not detected
```
Even in the “modular” mode, some hash functions are considered archaic and are no longer recommended at all; for instance, module $1$ is officially abandoned by its author: see phk.freebsd.dk/sagas/md5crypt_eol/ . For another instance, module $3$ is considered completely broken: see the manpage of FreeBSD.
On some OSes such as Mac OS, there is no modular mode. Yet, as written above, crypt(3) on Mac OS never fails. This means that even if you build up a proper salt string, it generates a traditional DES hash anyway, and there is no way for you to be aware of it.
```
"foo".crypt("$5$rounds=1000$salt$") # => "$5fNPQMxC5j6."
```

If for some reason you cannot migrate to other secure contemporary password hashing algorithms, install the string-crypt gem and require 'string/crypt' to continue using it.

-self → frozen_string

Returns a frozen string equal to self.

The returned string is self if and only if all of the following are true:

self is already frozen.
self is an instance of String (rather than of a subclass of String).
self has no instance variables set on it.

Otherwise, the returned string is a frozen copy of self.

Returning self, when possible, saves duplicating self; see Data deduplication.

It may also save duplicating other, already-existing, strings:

s0 = 'foo'
s1 = 'foo'
s0.object_id == s1.object_id # => false
(-s0).object_id == (-s1).object_id # => true

Note that method -@ is convenient for defining a constant:

FileName = -'config/database.yml'

While its alias dedup is better suited for chaining:

'foo'.dedup.gsub('o', '0') # => "f00"
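To make the deduplication concrete, the sketch below checks object identity rather than equality. The helper name same_deduplicated_object? is hypothetical, and the expected results assume a recent CRuby and plain String instances with no instance variables:

```ruby
# Hypothetical helper: true when both strings deduplicate to one
# shared frozen object (identity, not just equal content).
def same_deduplicated_object?(first, second)
  (-first).equal?(-second)
end

a = 'config/database.yml'.dup
b = 'config/database.yml'.dup

a.equal?(b)                     # => false (two distinct objects)
same_deduplicated_object?(a, b) # => true
(-a).frozen?                    # => true
a.frozen?                       # => false (the receiver itself is untouched)
```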

Related: see Freezing/Unfreezing.

Alias for: -@

delete(*selectors) → new_string

Source

static VALUE
rb_str_delete(int argc, VALUE *argv, VALUE str)
{
    str = str_duplicate(rb_cString, str);
    rb_str_delete_bang(argc, argv, str);
    return str;
}

Returns a new string that is a copy of self with certain characters removed; the removed characters are all instances of those specified by the given string selectors.

For one 1-character selector, removes all instances of that character:

s = 'abracadabra'
s.delete('a') # => "brcdbr"
s.delete('b') # => "aracadara"
s.delete('x') # => "abracadabra"
s.delete('') # => "abracadabra"
s = 'тест'
s.delete('т') # => "ес"
s.delete('е') # => "тст"
s = 'よろしくお願いします'
s.delete('よ') # => "ろしくお願いします"
s.delete('し') # => "よろくお願います"

For one multi-character selector, removes all instances of the specified characters:

s = 'abracadabra'
s.delete('ab') # => "rcdr"
s.delete('abc') # => "rdr"
s.delete('abcd') # => "rr"
s.delete('abcdr') # => ""
s.delete('abcdrx') # => ""

Order and repetition do not matter:

s.delete('ba') == s.delete('ab') # => true
s.delete('baab') == s.delete('ab') # => true

For multiple selectors, forms a single selector that is the intersection of characters in all selectors and removes all instances of characters specified by that selector:

s = 'abcdefg'
s.delete('abcde', 'dcbfg') == s.delete('bcd') # => true
s.delete('abc', 'def') == s.delete('') # => true

In a character selector, three characters get special treatment:

A caret ('^') functions as a negation operator for the immediately following characters:
```
s = 'abracadabra'
s.delete('^bc') # => "bcb"  # Deletes all except 'b' and 'c'.
```

A hyphen ('-') between two other characters defines a range of characters:

s = 'abracadabra'
s.delete('a-c') # => "rdr"  # Deletes all 'a', 'b', and 'c'.

A backslash ('\') acts as an escape for a caret, a hyphen, or another backslash:

s = 'abracadabra'
s.delete('\^bc') # => "araadara"   # Deletes all '^', 'b', and 'c'.
s.delete('a\-c') # => "brdbr"      # Deletes all 'a', '-', and 'c'.
'foo\bar\baz'.delete('\\') # => "foobarbaz"  # Deletes all '\'.

These usages may be mixed:

s = 'abracadabra'
s.delete('a-cq-t') # => "d"         # Multiple ranges.
s.delete('ac-d') # => "brbr"      # Range mixed with plain characters.
s.delete('^a-c') # => "abacaaba"  # Range mixed with negation.

For multiple selectors, all forms may be used, including negations, ranges, and escapes.

s = 'abracadabra'
s.delete('^abc', '^def') == s.delete('^abcdef') # => true
s.delete('a-e', 'c-g') == s.delete('cde') # => true
s.delete('^abc', 'c-g') == s.delete('defg') # => true
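Two common uses of these selectors can be sketched as follows: a negated range acts as a keep-list, and two selectors together act as an intersection. The helper names digits_only and drop_lowercase_vowels are hypothetical:

```ruby
# Hypothetical helper: keeps only ASCII digits by deleting every
# character that is NOT in the range 0-9.
def digits_only(text)
  text.delete('^0-9')
end

# Hypothetical helper: deletes characters that are both lowercase
# ASCII letters and vowels (intersection of the two selectors).
def drop_lowercase_vowels(text)
  text.delete('a-z', 'aeiou')
end

digits_only('(555) 123-4567')      # => "5551234567"
drop_lowercase_vowels('Education') # => "Edctn"
```

Both helpers return new strings and leave their argument unchanged, since delete (unlike delete!) works on a copy.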

Related: see Converting to New String.

delete!(*selectors) → self or nil

Source

static VALUErb_str_delete_bang(int argc, VALUE *argv, VALUE str){    char squeez[TR_TABLE_SIZE];    rb_encoding *enc = 0;    char *s, *send, *t;    VALUE del = 0, nodel = 0;    int modify = 0;    int i, ascompat, cr;    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return Qnil;    rb_check_arity(argc, 1, UNLIMITED_ARGUMENTS);    for (i=0; i<argc; i++) {        VALUE s = argv[i];        StringValue(s);        enc = rb_enc_check(str, s);        tr_setup_table(s, squeez, i==0, &del, &nodel, enc);    }    str_modify_keep_cr(str);    ascompat = rb_enc_asciicompat(enc);    s = t = RSTRING_PTR(str);    send = RSTRING_END(str);    cr = ascompat ? ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID;    while (s < send) {        unsigned int c;        int clen;        if (ascompat && (c = *(unsigned char*)s) < 0x80) {            if (squeez[c]) {                modify = 1;            }            else {                if (t != s) *t = c;                t++;            }            s++;        }        else {            c = rb_enc_codepoint_len(s, send, &clen, enc);            if (tr_find(c, squeez, del, nodel)) {                modify = 1;            }            else {                if (t != s) rb_enc_mbcput(c, t, enc);                t += clen;                if (cr == ENC_CODERANGE_7BIT) cr = ENC_CODERANGE_VALID;            }            s += clen;        }    }    TERM_FILL(t, TERM_LEN(str));    STR_SET_LEN(str, t - RSTRING_PTR(str));    ENC_CODERANGE_SET(str, cr);    if (modify) return str;    return Qnil;}

Like String#delete, but modifies self in place; returns self if any characters were deleted, nil otherwise.

Related: see Modifying.

delete_prefix(prefix) → new_string

Source

static VALUE
rb_str_delete_prefix(VALUE str, VALUE prefix)
{
    long prefixlen;
    prefixlen = deleted_prefix_length(str, prefix);
    if (prefixlen <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, prefixlen, RSTRING_LEN(str) - prefixlen);
}

Returns a copy of self with leading substring prefix removed:

'oof'.delete_prefix('o') # => "of"
'oof'.delete_prefix('oo') # => "f"
'oof'.delete_prefix('oof') # => ""
'oof'.delete_prefix('x') # => "oof"
'тест'.delete_prefix('те') # => "ст"
'こんにちは'.delete_prefix('こん') # => "にちは"

Related: see Converting to New String.

delete_prefix!(prefix) → self or nil

Source

static VALUE
rb_str_delete_prefix_bang(VALUE str, VALUE prefix)
{
    long prefixlen;
    str_modify_keep_cr(str);
    prefixlen = deleted_prefix_length(str, prefix);
    if (prefixlen <= 0) return Qnil;
    return rb_str_drop_bytes(str, prefixlen);
}

Like String#delete_prefix, except that self is modified in place; returns self if the prefix is removed, nil otherwise.

Related: see Modifying.

delete_suffix(suffix) → new_string

Source

static VALUE
rb_str_delete_suffix(VALUE str, VALUE suffix)
{
    long suffixlen;
    suffixlen = deleted_suffix_length(str, suffix);
    if (suffixlen <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, 0, RSTRING_LEN(str) - suffixlen);
}

Returns a copy of self with trailing substring suffix removed:

'foo'.delete_suffix('o') # => "fo"
'foo'.delete_suffix('oo') # => "f"
'foo'.delete_suffix('foo') # => ""
'foo'.delete_suffix('f') # => "foo"
'foo'.delete_suffix('x') # => "foo"
'тест'.delete_suffix('ст') # => "те"
'こんにちは'.delete_suffix('ちは') # => "こんに"
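Since both delete_prefix and delete_suffix return a copy even when nothing matches, they chain safely. A minimal sketch, with a hypothetical helper name and a hypothetical file-naming convention:

```ruby
# Hypothetical helper: derives a bare name from a test file name by
# removing an optional "test_" prefix and an optional ".rb" suffix.
def subject_name(file_name)
  file_name.delete_prefix('test_').delete_suffix('.rb')
end

subject_name('test_string.rb') # => "string"
subject_name('string.rb')      # => "string"
subject_name('README')         # => "README"
```

The bang variants would not chain this way, because they return nil when the prefix or suffix is absent.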

Related: see Converting to New String.

delete_suffix!(suffix) → self or nil

Source

static VALUE
rb_str_delete_suffix_bang(VALUE str, VALUE suffix)
{
    long olen, suffixlen, len;
    str_modifiable(str);
    suffixlen = deleted_suffix_length(str, suffix);
    if (suffixlen <= 0) return Qnil;
    olen = RSTRING_LEN(str);
    str_modify_keep_cr(str);
    len = olen - suffixlen;
    STR_SET_LEN(str, len);
    TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));
    if (ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {
        ENC_CODERANGE_CLEAR(str);
    }
    return str;
}

Like String#delete_suffix, except that self is modified in place; returns self if the suffix is removed, nil otherwise.

Related: see Modifying.

downcase(mapping) → string

Source

static VALUE
rb_str_downcase(int argc, VALUE *argv, VALUE str)
{
    rb_encoding *enc;
    OnigCaseFoldType flags = ONIGENC_CASE_DOWNCASE;
    VALUE ret;
    flags = check_case_options(argc, argv, flags);
    enc = str_true_enc(str);
    if (case_option_single_p(flags, enc, str)) {
        ret = rb_str_new(RSTRING_PTR(str), RSTRING_LEN(str));
        str_enc_copy_direct(ret, str);
        downcase_single(ret);
    }
    else if (flags&ONIGENC_CASE_ASCII_ONLY) {
        ret = rb_str_new(0, RSTRING_LEN(str));
        rb_str_ascii_casemap(str, ret, &flags, enc);
    }
    else {
        ret = rb_str_casemap(str, &flags, enc);
    }
    return ret;
}

Returns a new string containing the downcased characters in self:

'Hello, World!'.downcase # => "hello, world!"
'ТЕСТ'.downcase # => "тест"
'よろしくお願いします'.downcase # => "よろしくお願いします"

Some characters do not have upcased and downcased versions.

The casing may be affected by the given mapping; see Case Mapping.
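As a small illustration of how mapping changes the result, the sketch below compares the default (full Unicode) mapping with the :ascii mapping, which only affects the letters A-Z. The helper name downcase_variants is hypothetical:

```ruby
# Hypothetical helper: returns [default_mapping, ascii_only_mapping].
def downcase_variants(text)
  [text.downcase, text.downcase(:ascii)]
end

downcase_variants('ÀB')    # => ["àb", "Àb"]
downcase_variants('Hello') # => ["hello", "hello"]
```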

Related: see Converting to New String.

downcase!(mapping) → self or nil

Source

static VALUE
rb_str_downcase_bang(int argc, VALUE *argv, VALUE str)
{
    rb_encoding *enc;
    OnigCaseFoldType flags = ONIGENC_CASE_DOWNCASE;
    flags = check_case_options(argc, argv, flags);
    str_modify_keep_cr(str);
    enc = str_true_enc(str);
    if (case_option_single_p(flags, enc, str)) {
        if (downcase_single(str))
            flags |= ONIGENC_CASE_MODIFIED;
    }
    else if (flags&ONIGENC_CASE_ASCII_ONLY)
        rb_str_ascii_casemap(str, str, &flags, enc);
    else
        str_shared_replace(str, rb_str_casemap(str, &flags, enc));
    if (ONIGENC_CASE_MODIFIED&flags) return str;
    return Qnil;
}

Like String#downcase, except that:

Changes character casings in self (not in a copy of self).
Returns self if any changes are made, nil otherwise.

Related: see Modifying.

dump → new_string

Source

VALUErb_str_dump(VALUE str){    int encidx = rb_enc_get_index(str);    rb_encoding *enc = rb_enc_from_index(encidx);    long len;    const char *p, *pend;    char *q, *qend;    VALUE result;    int u8 = (encidx == rb_utf8_encindex());    static const char nonascii_suffix[] = ".dup.force_encoding(\"%s\")";    len = 2;                    /* "" */    if (!rb_enc_asciicompat(enc)) {        len += strlen(nonascii_suffix) - rb_strlen_lit("%s");        len += strlen(enc->name);    }    p = RSTRING_PTR(str); pend = p + RSTRING_LEN(str);    while (p < pend) {        int clen;        unsigned char c = *p++;        switch (c) {          case '"':  case '\\':          case '\n': case '\r':          case '\t': case '\f':          case '\013': case '\010': case '\007': case '\033':            clen = 2;            break;          case '#':            clen = IS_EVSTR(p, pend) ? 2 : 1;            break;          default:            if (ISPRINT(c)) {                clen = 1;            }            else {                if (u8 && c > 0x7F) {   /* \u notation */                    int n = rb_enc_precise_mbclen(p-1, pend, enc);                    if (MBCLEN_CHARFOUND_P(n)) {                        unsigned int cc = rb_enc_mbc_to_codepoint(p-1, pend, enc);                        if (cc <= 0xFFFF)                            clen = 6;  /* \uXXXX */                        else if (cc <= 0xFFFFF)                            clen = 9;  /* \u{XXXXX} */                        else                            clen = 10; /* \u{XXXXXX} */                        p += MBCLEN_CHARFOUND_LEN(n)-1;                        break;                    }                }                clen = 4;       /* \xNN */            }            break;        }        if (clen > LONG_MAX - len) {            rb_raise(rb_eRuntimeError, "string size too big");        }        len += clen;    }    result = rb_str_new(0, len);    p = RSTRING_PTR(str); pend = p + RSTRING_LEN(str);    q = RSTRING_PTR(result); qend = q + len + 
1;    *q++ = '"';    while (p < pend) {        unsigned char c = *p++;        if (c == '"' || c == '\\') {            *q++ = '\\';            *q++ = c;        }        else if (c == '#') {            if (IS_EVSTR(p, pend)) *q++ = '\\';            *q++ = '#';        }        else if (c == '\n') {            *q++ = '\\';            *q++ = 'n';        }        else if (c == '\r') {            *q++ = '\\';            *q++ = 'r';        }        else if (c == '\t') {            *q++ = '\\';            *q++ = 't';        }        else if (c == '\f') {            *q++ = '\\';            *q++ = 'f';        }        else if (c == '\013') {            *q++ = '\\';            *q++ = 'v';        }        else if (c == '\010') {            *q++ = '\\';            *q++ = 'b';        }        else if (c == '\007') {            *q++ = '\\';            *q++ = 'a';        }        else if (c == '\033') {            *q++ = '\\';            *q++ = 'e';        }        else if (ISPRINT(c)) {            *q++ = c;        }        else {            *q++ = '\\';            if (u8) {                int n = rb_enc_precise_mbclen(p-1, pend, enc) - 1;                if (MBCLEN_CHARFOUND_P(n)) {                    int cc = rb_enc_mbc_to_codepoint(p-1, pend, enc);                    p += n;                    if (cc <= 0xFFFF)                        snprintf(q, qend-q, "u%04X", cc);    /* \uXXXX */                    else                        snprintf(q, qend-q, "u{%X}", cc);  /* \u{XXXXX} or \u{XXXXXX} */                    q += strlen(q);                    continue;                }            }            snprintf(q, qend-q, "x%02X", c);            q += 3;        }    }    *q++ = '"';    *q = '\0';    if (!rb_enc_asciicompat(enc)) {        snprintf(q, qend-q, nonascii_suffix, enc->name);        encidx = rb_ascii8bit_encindex();    }    /* result from dump is ASCII */    rb_enc_associate_index(result, encidx);    ENC_CODERANGE_SET(result, ENC_CODERANGE_7BIT);    return result;}

Returns a printable version of self, enclosed in double-quotes:

'hello'.dump # => "\"hello\""

Certain special characters are rendered with escapes:

'"'.dump # => "\"\\\"\""
'\\'.dump # => "\"\\\\\""

Non-printing characters are rendered with escapes:

s = ''
s << 7 # Alarm (bell).
s << 8 # Back space.
s << 9 # Horizontal tab.
s << 10 # Line feed.
s << 11 # Vertical tab.
s << 12 # Form feed.
s << 13 # Carriage return.
s # => "\a\b\t\n\v\f\r"
s.dump # => "\"\\a\\b\\t\\n\\v\\f\\r\""

If self is encoded in UTF-8 and contains Unicode characters, renders those characters as Unicode escape sequences:

'тест'.dump # => "\"\\u0442\\u0435\\u0441\\u0442\""
'こんにちは'.dump # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""

If the encoding of self is not ASCII-compatible (i.e., self.encoding.ascii_compatible? returns false), renders all ASCII-compatible bytes as ASCII characters and all other bytes as hexadecimal. Appends .dup.force_encoding("<encoding>"), where <encoding> is self.encoding.name:

s = 'hello'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"hello\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF\\x00h\\x00e\\x00l\\x00l\\x00o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"h\\x00e\\x00l\\x00l\\x00o\\x00\".dup.force_encoding(\"UTF-16LE\")"
s = 'тест'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"\\u0442\\u0435\\u0441\\u0442\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF\\x04B\\x045\\x04A\\x04B\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"B\\x045\\x04A\\x04B\\x04\".dup.force_encoding(\"UTF-16LE\")"
s = 'こんにちは'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF0S0\\x930k0a0o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"S0\\x930k0a0o0\".dup.force_encoding(\"UTF-16LE\")"
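One way to sanity-check a dumped string is to reverse it with String#undump, which is a separate method not described in this section. The sketch below only covers UTF-8 inputs, and the helper name round_trips? is hypothetical:

```ruby
# Hypothetical helper: true when a string survives a
# dump/undump round trip with identical content.
def round_trips?(string)
  string.dump.undump == string
end

round_trips?("tab\there")      # => true
round_trips?('тест')           # => true
round_trips?('say "hi" \\ ok') # => true
```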

Related: see Converting to New String.

each_byte {|byte| ... } → self

each_byte → enumerator

Source

static VALUE
rb_str_each_byte(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_byte_size);
    return rb_str_enumerate_bytes(str, 0);
}

With a block given, calls the block with each successive byte from self; returns self:

a = []
'hello'.each_byte {|byte| a.push(byte) } # Five 1-byte characters.
a # => [104, 101, 108, 108, 111]
a = []
'тест'.each_byte {|byte| a.push(byte) } # Four 2-byte characters.
a # => [209, 130, 208, 181, 209, 129, 209, 130]
a = []
'こんにちは'.each_byte {|byte| a.push(byte) } # Five 3-byte characters.
a # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]

With no block given, returns an enumerator.
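The enumerator form is convenient for chaining with Enumerable methods. As a sketch, the hypothetical helper hex_bytes renders each byte in hexadecimal:

```ruby
# Hypothetical helper: formats every byte of the string as two
# uppercase hex digits, separated by spaces.
def hex_bytes(string)
  string.each_byte.map { |byte| format('%02X', byte) }.join(' ')
end

hex_bytes('hello') # => "68 65 6C 6C 6F"
hex_bytes('т')     # => "D1 82"
```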

Related: see Iterating.

each_char {|char| ... } → self

each_char → enumerator

Source

static VALUE
rb_str_each_char(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_chars(str, 0);
}

With a block given, calls the block with each successive character from self; returns self:

a = []
'hello'.each_char do |char|
  a.push(char)
end
a # => ["h", "e", "l", "l", "o"]
a = []
'тест'.each_char do |char|
  a.push(char)
end
a # => ["т", "е", "с", "т"]
a = []
'こんにちは'.each_char do |char|
  a.push(char)
end
a # => ["こ", "ん", "に", "ち", "は"]

With no block given, returns an enumerator.

Related: see Iterating.

each_codepoint {|codepoint| ... } → self

each_codepoint → enumerator

Source

static VALUE
rb_str_each_codepoint(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_codepoints(str, 0);
}

With a block given, calls the block with each successive codepoint from self; each codepoint is the integer value for a character; returns self:

a = []
'hello'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [104, 101, 108, 108, 111]
a = []
'тест'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [1090, 1077, 1089, 1090]
a = []
'こんにちは'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [12371, 12435, 12395, 12385, 12399]

With no block given, returns an enumerator.

Related: see Iterating.

each_grapheme_cluster {|grapheme_cluster| ... } → self

each_grapheme_cluster → enumerator

Source

static VALUE
rb_str_each_grapheme_cluster(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_grapheme_cluster_size);
    return rb_str_enumerate_grapheme_clusters(str, 0);
}

With a block given, calls the given block with each successive grapheme cluster from self (see Unicode Grapheme Cluster Boundaries); returns self:

a = []
'hello'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["h", "e", "l", "l", "o"]
a = []
'тест'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["т", "е", "с", "т"]
a = []
'こんにちは'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["こ", "ん", "に", "ち", "は"]

With no block given, returns an enumerator.
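In the examples above, characters and grapheme clusters coincide. They differ once combining characters are involved, as this sketch shows with "e" followed by U+0301 (combining acute accent). The helper name char_and_cluster_counts is hypothetical:

```ruby
# Hypothetical helper: returns [character_count, grapheme_cluster_count].
def char_and_cluster_counts(string)
  [string.each_char.count, string.each_grapheme_cluster.count]
end

char_and_cluster_counts("e\u0301") # => [2, 1]
char_and_cluster_counts('hello')   # => [5, 5]
```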

Related: see Iterating.

each_line(record_separator = $/, chomp: false) {|substring| ... } → self

each_line(record_separator = $/, chomp: false) → enumerator

Source

static VALUE
rb_str_each_line(int argc, VALUE *argv, VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, argc, argv, 0);
    return rb_str_enumerate_lines(argc, argv, str, 0);
}

With a block given, forms the substrings (lines) that are the result of splitting self at each occurrence of the given record_separator; passes each line to the block; returns self.

With the default record_separator:

$/ # => "\n"
s = <<~EOT
This is the first line.
This is line two.

This is line four.
This is line five.
EOT
s.each_line {|line| p line }

Output:

"This is the first line.\n"
"This is line two.\n"
"\n"
"This is line four.\n"
"This is line five.\n"

With a different record_separator:

record_separator = ' is '
s.each_line(record_separator) {|line| p line }

Output:

"This is "
"the first line.\nThis is "
"line two.\n\nThis is "
"line four.\nThis is "
"line five.\n"

With chomp as true, removes the trailing record_separator from each line:

s.each_line(chomp: true) {|line| p line }

Output:

"This is the first line."
"This is line two."
""
"This is line four."
"This is line five."

With an empty string as record_separator, forms and passes “paragraphs” by splitting at each occurrence of two or more newlines:

record_separator = ''
s.each_line(record_separator) {|line| p line }

Output:

"This is the first line.\nThis is line two.\n\n"
"This is line four.\nThis is line five.\n"

With no block given, returns an enumerator.
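The enumerator form combines naturally with the chomp option. A minimal sketch, using a hypothetical helper name, that collects the non-blank lines of a text:

```ruby
# Hypothetical helper: returns the lines of text without their
# trailing newline, skipping lines that are empty.
def non_blank_lines(text)
  text.each_line(chomp: true).reject(&:empty?)
end

non_blank_lines("first\nsecond\n\nfourth\n")
# => ["first", "second", "fourth"]
```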

Related: see Iterating.

empty? → true or false

Source

static VALUE
rb_str_empty(VALUE str)
{
    return RBOOL(RSTRING_LEN(str) == 0);
}

Returns whether the length of self is zero:

'hello'.empty? # => false
' '.empty? # => false
''.empty? # => true

Related: see Querying.

encode(dst_encoding = Encoding.default_internal, **enc_opts) → string

encode(dst_encoding, src_encoding, **enc_opts) → string

Source

static VALUE
str_encode(int argc, VALUE *argv, VALUE str)
{
    VALUE newstr = str;
    int encidx = str_transcode(argc, argv, &newstr);
    return encoded_dup(newstr, str, encidx);
}

Returns a copy of self transcoded as determined by dst_encoding; see Encodings.

By default, raises an exception if self contains an invalid byte or a character not defined in dst_encoding; that behavior may be modified by encoding options; see below.

With no arguments:

Uses the same encoding if Encoding.default_internal is nil (the default):

Encoding.default_internal # => nil
s = "Ruby\x99".force_encoding('Windows-1252')
s.encoding # => #<Encoding:Windows-1252>
s.bytes # => [82, 117, 98, 121, 153]
t = s.encode # => "Ruby\x99"
t.encoding # => #<Encoding:Windows-1252>
t.bytes # => [82, 117, 98, 121, 153]

Otherwise, uses the encoding Encoding.default_internal:

Encoding.default_internal = 'UTF-8'
t = s.encode # => "Ruby™"
t.encoding # => #<Encoding:UTF-8>
t.bytes # => [82, 117, 98, 121, 226, 132, 162]

With only argument dst_encoding given, uses that encoding:

s = "Ruby\x99".force_encoding('Windows-1252')
s.encoding # => #<Encoding:Windows-1252>
t = s.encode('UTF-8') # => "Ruby™"
t.encoding # => #<Encoding:UTF-8>

With arguments dst_encoding and src_encoding given, interprets self using src_encoding, encodes the new string using dst_encoding:

s = "Ruby\x99"
t = s.encode('UTF-8', 'Windows-1252') # => "Ruby™"
t.encoding # => #<Encoding:UTF-8>

Optional keyword arguments enc_opts specify encoding options; see Encoding Options.

Please note that, unless the invalid: :replace option is given, conversion from an encoding enc to the same encoding enc (whether enc is given explicitly or implicitly) is a no-op; that is, the string is simply copied without any changes, and no exceptions are raised, even if there are invalid bytes.
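To illustrate how encoding options change the default raise-on-error behavior, the sketch below transcodes to US-ASCII with replacement. It uses the invalid:, undef:, and replace: options from Encoding Options; the helper name to_ascii_lossy is hypothetical:

```ruby
# Hypothetical helper: transcodes to US-ASCII, substituting "?" for
# invalid bytes and for characters US-ASCII cannot represent.
def to_ascii_lossy(text)
  text.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
end

to_ascii_lossy('café') # => "caf?"

# Without the options, the same conversion raises.
begin
  'café'.encode('US-ASCII')
rescue Encoding::UndefinedConversionError => e
  e.class # => Encoding::UndefinedConversionError
end
```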

Related: see Converting to New String.

encode!(dst_encoding = Encoding.default_internal, **enc_opts) → self

encode!(dst_encoding, src_encoding, **enc_opts) → self

Source

static VALUE
str_encode_bang(int argc, VALUE *argv, VALUE str)
{
    VALUE newstr;
    int encidx;
    rb_check_frozen(str);
    newstr = str;
    encidx = str_transcode(argc, argv, &newstr);
    if (encidx < 0) return str;
    if (newstr == str) {
        rb_enc_associate_index(str, encidx);
        return str;
    }
    rb_str_shared_replace(str, newstr);
    return str_encode_associate(str, encidx);
}

Like encode, but applies encoding changes to self; returns self.

Related: see Modifying.

encoding → encoding

Source

VALUErb_obj_encoding(VALUE obj){    int idx = rb_enc_get_index(obj);    if (idx < 0) {        rb_raise(rb_eTypeError, "unknown encoding");    }    return rb_enc_from_encoding_index(idx & ENC_INDEX_MASK);}

Returns an Encoding object that represents the encoding of self; see Encodings.
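A brief sketch of what the returned object looks like; the strings are transcoded explicitly so that the results do not depend on the source file's encoding:

```ruby
s = 'foo'.encode('UTF-8')
s.encoding                    # => #<Encoding:UTF-8>
s.encoding == Encoding::UTF_8 # => true
s.encoding.name               # => "UTF-8"

t = s.encode('ISO-8859-1')
t.encoding                    # => #<Encoding:ISO-8859-1>
t.encoding.ascii_compatible?  # => true
```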

Related: see Querying.

end_with?(*strings) → true or false

Source

static VALUErb_str_end_with(int argc, VALUE *argv, VALUE str){    int i;    for (i=0; i<argc; i++) {        VALUE tmp = argv[i];        const char *p, *s, *e;        long slen, tlen;        rb_encoding *enc;        StringValue(tmp);        enc = rb_enc_check(str, tmp);        if ((tlen = RSTRING_LEN(tmp)) == 0) return Qtrue;        if ((slen = RSTRING_LEN(str)) < tlen) continue;        p = RSTRING_PTR(str);        e = p + slen;        s = e - tlen;        if (!at_char_boundary(p, s, e, enc))            continue;        if (memcmp(s, RSTRING_PTR(tmp), tlen) == 0)            return Qtrue;    }    return Qfalse;}

Returns whether self ends with any of the given strings:

    'foo'.end_with?('oo')         # => true
    'foo'.end_with?('bar', 'oo')  # => true
    'foo'.end_with?('bar', 'baz') # => false
    'foo'.end_with?('')           # => true
    'тест'.end_with?('т')         # => true
    'こんにちは'.end_with?('は')  # => true

Related: see Querying.

eql?(object) → true or false

Source

    VALUE
    rb_str_eql(VALUE str1, VALUE str2)
    {
        if (str1 == str2) return Qtrue;
        if (!RB_TYPE_P(str2, T_STRING)) return Qfalse;
        return rb_str_eql_internal(str1, str2);
    }

Returns whether self and object have the same length and content:

    s = 'foo'
    s.eql?('foo')  # => true
    s.eql?('food') # => false
    s.eql?('FOO')  # => false

Returns false if the two strings’ encodings are not compatible:

    s0 = "äöü"                           # => "äöü"
    s1 = s0.encode(Encoding::ISO_8859_1) # => "\xE4\xF6\xFC"
    s0.encoding                          # => #<Encoding:UTF-8>
    s1.encoding                          # => #<Encoding:ISO-8859-1>
    s0.eql?(s1)                          # => false

See Encodings.

Related: see Querying.

force_encoding(encoding) → self

Source

static VALUErb_str_force_encoding(VALUE str, VALUE enc){    str_modifiable(str);    rb_encoding *encoding = rb_to_encoding(enc);    int idx = rb_enc_to_index(encoding);    // If the encoding is unchanged, we do nothing.    if (ENCODING_GET(str) == idx) {        return str;    }    rb_enc_associate_index(str, idx);    // If the coderange was 7bit and the new encoding is ASCII-compatible    // we can keep the coderange.    if (ENC_CODERANGE(str) == ENC_CODERANGE_7BIT && encoding && rb_enc_asciicompat(encoding)) {        return str;    }    ENC_CODERANGE_CLEAR(str);    return str;}

Changes the encoding of self to the given encoding, which may be a string encoding name or an Encoding object; does not change the underlying bytes; returns self:

    s = 'łał'
    s.bytes                   # => [197, 130, 97, 197, 130]
    s.encoding                # => #<Encoding:UTF-8>
    s.valid_encoding?         # => true
    s.force_encoding('ascii') # => "\xC5\x82a\xC5\x82"
    s.encoding                # => #<Encoding:US-ASCII>
    s.bytes                   # => [197, 130, 97, 197, 130]

Makes the change even if the given encoding is invalid for self (as is the change above):

    s.valid_encoding? # => false

See Encodings.

Related: see Modifying.

getbyte(index) → integer or nil

Source

    VALUE
    rb_str_getbyte(VALUE str, VALUE index)
    {
        long pos = NUM2LONG(index);

        if (pos < 0)
            pos += RSTRING_LEN(str);
        if (pos < 0 || RSTRING_LEN(str) <= pos)
            return Qnil;

        return INT2FIX((unsigned char)RSTRING_PTR(str)[pos]);
    }

Returns the byte at zero-based index as an integer:

    s = 'foo'
    s.getbyte(0) # => 102
    s.getbyte(1) # => 111
    s.getbyte(2) # => 111

Counts backward from the end if index is negative:

    s.getbyte(-3) # => 102

Returns nil if index is out of range:

    s.getbyte(3)  # => nil
    s.getbyte(-4) # => nil

More examples:

    s = 'тест'
    s.bytes      # => [209, 130, 208, 181, 209, 129, 209, 130]
    s.getbyte(2) # => 208
    s = 'こんにちは'
    s.bytes      # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]
    s.getbyte(2) # => 147

Related: see Converting to Non-String.

grapheme_clusters → array_of_grapheme_clusters

Source

    static VALUE
    rb_str_grapheme_clusters(VALUE str)
    {
        VALUE ary = WANTARRAY("grapheme_clusters", rb_str_strlen(str));
        return rb_str_enumerate_grapheme_clusters(str, ary);
    }

Returns an array of the grapheme clusters in self (see Unicode Grapheme Cluster Boundaries):

s ="ä-pqr-b̈-xyz-c̈"s.size# => 16s.bytesize# => 19s.grapheme_clusters.size# => 13s.grapheme_clusters# => ["ä", "-", "p", "q", "r", "-", "b̈", "-", "x", "y", "z", "-", "c̈"]

Details:

s ="ä"s.grapheme_clusters# => ["ä"]           # One grapheme cluster.s.bytes# => [97, 204, 136]  # Three bytes.s.chars# => ["a", "̈"]       # Two characters.s.chars.map {|char|char.ord }# => [97, 776]       # Their values.

Related: see Converting to Non-String.

gsub(pattern, replacement) → new_string

gsub(pattern) {|match| ... } → new_string

gsub(pattern) → enumerator

Source

    static VALUE
    rb_str_gsub(int argc, VALUE *argv, VALUE str)
    {
        return str_gsub(argc, argv, str, 0);
    }

Returns a copy of self with zero or more substrings replaced.

Argument pattern may be a string or a Regexp; argument replacement may be a string or a Hash. Varying types for the argument values make this method very versatile.

Below are some simple examples; for many more examples, see Substitution Methods.

With arguments pattern and string replacement given, replaces each matching substring with the given replacement string:

    s = 'abracadabra'
    s.gsub('ab', 'AB')   # => "ABracadABra"
    s.gsub(/[a-c]/, 'X') # => "XXrXXXdXXrX"

With arguments pattern and hash replacement given, replaces each matching substring with a value from the given replacement hash, or removes it:

    h = {'a' => 'A', 'b' => 'B', 'c' => 'C'}
    s.gsub(/[a-c]/, h) # => "ABrACAdABrA"  # 'a', 'b', 'c' replaced.
    s.gsub(/[a-d]/, h) # => "ABrACAABrA"   # 'd' removed.

With argument pattern and a block given, calls the block with each matching substring; replaces that substring with the block’s return value:

    s.gsub(/[a-d]/) {|substring| substring.upcase } # => "ABrACADABrA"

With argument pattern given and neither replacement nor a block given, returns a new Enumerator.
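The enumerator form has no example above, so here is a small sketch of two ways it might be used; chaining Enumerator#with_index is a common idiom rather than something this page prescribes:

```ruby
s = 'abracadabra'
e = s.gsub(/[a-c]/)  # Neither replacement nor block given.
e.class              # => Enumerator
e.to_a               # => ["a", "b", "a", "c", "a", "a", "b", "a"]

# Chaining with_index numbers the matches; each block value becomes the
# replacement, and the resulting new string is returned.
t = s.gsub(/a/).with_index {|match, i| i.to_s }
t                    # => "0br1c2d3br4"
```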

Related: see Converting to New String.

gsub!(pattern, replacement) → self or nil

gsub!(pattern) {|match| ... } → self or nil

gsub!(pattern) → an_enumerator

Source

    static VALUE
    rb_str_gsub_bang(int argc, VALUE *argv, VALUE str)
    {
        str_modify_keep_cr(str);
        return str_gsub(argc, argv, str, 1);
    }

Like String#gsub, except that:

Performs substitutions in self (not in a copy of self).
Returns self if any substitutions are made, nil otherwise.
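A short sketch of the two return values; the dup call is an assumption added so the sketch also works where string literals are frozen:

```ruby
s = 'hello'.dup
r = s.gsub!(/l/, 'L')
r.equal?(s)            # => true  # self is returned and has been modified.
s                      # => "heLLo"

n = s.gsub!(/x/, '*')  # No match, so nothing is substituted.
n                      # => nil
s                      # => "heLLo"
```

Because of the nil return, chaining further calls onto gsub! can fail when nothing matches; gsub may be the safer choice in a method chain.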

Related: see Modifying.

hash → integer

Source

    static VALUE
    rb_str_hash_m(VALUE str)
    {
        st_index_t hval = rb_str_hash(str);
        return ST2FIX(hval);
    }

Returns the integer hash value for self.

Two String objects that have identical content and compatible encodings also have the same hash value; see Object#hash and Encodings:

    s = 'foo'
    h = s.hash       # => -569050784
    h == 'foo'.hash  # => true
    h == 'food'.hash # => false
    h == 'FOO'.hash  # => false

    s0 = "äöü"
    s1 = s0.encode(Encoding::ISO_8859_1)
    s0.encoding        # => #<Encoding:UTF-8>
    s1.encoding        # => #<Encoding:ISO-8859-1>
    s0.hash == s1.hash # => false

Related: see Querying.

hex → integer

Source

    static VALUE
    rb_str_hex(VALUE str)
    {
        return rb_str_to_inum(str, 16, FALSE);
    }

Interprets the leading substring of self as hexadecimal, possibly signed; returns its value as an integer.

The leading substring is interpreted as hexadecimal when it begins with:

One or more characters representing hexadecimal digits (each in one of the ranges '0'..'9', 'a'..'f', or 'A'..'F'); the string to be interpreted ends at the first character that does not represent a hexadecimal digit:
```
'f'.hex        # => 15
'11'.hex       # => 17
'FFF'.hex      # => 4095
'fffg'.hex     # => 4095
'foo'.hex      # => 15   # 'f' hexadecimal, 'oo' not.
'bar'.hex      # => 186  # 'ba' hexadecimal, 'r' not.
'deadbeef'.hex # => 3735928559
```
'0x' or '0X', followed by one or more hexadecimal digits:
```
'0xfff'.hex  # => 4095
'0xfffg'.hex # => 4095
```

Any of the above may be prefixed with '-', which negates the interpreted value:

    '-fff'.hex   # => -4095
    '-0xFFF'.hex # => -4095

For any substring not described above, returns zero:

    'xxx'.hex # => 0
    ''.hex    # => 0

Note that, unlike oct, this method interprets only hexadecimal, and not binary, octal, or decimal notations:

    '0b111'.hex # => 45329
    '0o777'.hex # => 0
    '0d999'.hex # => 55705

Related: see Converting to Non-String.

include?(other_string) → true or false

Source

    VALUE
    rb_str_include(VALUE str, VALUE arg)
    {
        long i;

        StringValue(arg);
        i = rb_str_index(str, arg, 0);

        return RBOOL(i != -1);
    }

Returns whether self contains other_string:

    s = 'bar'
    s.include?('ba')  # => true
    s.include?('ar')  # => true
    s.include?('bar') # => true
    s.include?('a')   # => true
    s.include?('')    # => true
    s.include?('foo') # => false

Related: see Querying.

index(pattern, offset = 0) → integer or nil

Source

static VALUErb_str_index_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    rb_encoding *enc = STR_ENC_GET(str);    long pos;    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        long slen = str_strlen(str, enc); /* str's enc */        pos = NUM2LONG(initpos);        if (pos < 0 ? (pos += slen) < 0 : pos > slen) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }    }    else {        pos = 0;    }    if (RB_TYPE_P(sub, T_REGEXP)) {        pos = str_offset(RSTRING_PTR(str), RSTRING_END(str), pos,                         enc, single_byte_optimizable(str));        if (rb_reg_search(sub, str, pos, 0) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = rb_str_sublen(str, BEG(0));            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_index(str, sub, pos);        if (pos >= 0) {            pos = rb_str_sublen(str, pos);            return LONG2NUM(pos);        }    }    return Qnil;}

Returns the integer position of the first substring that matches the given argument pattern, or nil if none found.

When pattern is a string, returns the index of the first matching substring in self:

    'foo'.index('f')         # => 0
    'foo'.index('o')         # => 1
    'foo'.index('oo')        # => 1
    'foo'.index('ooo')       # => nil
    'тест'.index('с')        # => 2  # Characters, not bytes.
    'こんにちは'.index('ち') # => 3

When pattern is a Regexp, returns the index of the first match in self:

    'foo'.index(/o./) # => 1
    'foo'.index(/.o/) # => 0

When offset is non-negative, begins the search at position offset; the returned index is relative to the beginning of self:

    'bar'.index('r', 0)         # => 2
    'bar'.index('r', 1)         # => 2
    'bar'.index('r', 2)         # => 2
    'bar'.index('r', 3)         # => nil
    'bar'.index(/[r-z]/, 0)     # => 2
    'тест'.index('с', 1)        # => 2
    'тест'.index('с', 2)        # => 2
    'тест'.index('с', 3)        # => nil  # Offset in characters, not bytes.
    'こんにちは'.index('ち', 2) # => 3

With negative integer argument offset, selects the search position by counting backward from the end of self:

    'foo'.index('o', -1)  # => 2
    'foo'.index('o', -2)  # => 1
    'foo'.index('o', -3)  # => 1
    'foo'.index('o', -4)  # => nil
    'foo'.index(/o./, -2) # => 1
    'foo'.index(/.o/, -2) # => 1

Related: see Querying.

initialize_copy

Alias for: replace

insert(offset, other_string) → self

Source

    static VALUE
    rb_str_insert(VALUE str, VALUE idx, VALUE str2)
    {
        long pos = NUM2LONG(idx);

        if (pos == -1) {
            return rb_str_append(str, str2);
        }
        else if (pos < 0) {
            pos++;
        }
        rb_str_update(str, pos, 0, str2);
        return str;
    }

Inserts the given other_string into self; returns self.

If the given offset is non-negative, inserts other_string at that offset:

    'foo'.insert(0, 'bar')        # => "barfoo"
    'foo'.insert(1, 'bar')        # => "fbaroo"
    'foo'.insert(3, 'bar')        # => "foobar"
    'тест'.insert(2, 'bar')       # => "теbarст"  # Characters, not bytes.
    'こんにちは'.insert(2, 'bar') # => "こんbarにちは"

If offset is negative, counts backward from the end of self and inserts other_string after the offset:

    'foo'.insert(-2, 'bar') # => "fobaro"

Related: see Modifying.

inspect → string

Source

VALUErb_str_inspect(VALUE str){    int encidx = ENCODING_GET(str);    rb_encoding *enc = rb_enc_from_index(encidx);    const char *p, *pend, *prev;    char buf[CHAR_ESC_LEN + 1];    VALUE result = rb_str_buf_new(0);    rb_encoding *resenc = rb_default_internal_encoding();    int unicode_p = rb_enc_unicode_p(enc);    int asciicompat = rb_enc_asciicompat(enc);    if (resenc == NULL) resenc = rb_default_external_encoding();    if (!rb_enc_asciicompat(resenc)) resenc = rb_usascii_encoding();    rb_enc_associate(result, resenc);    str_buf_cat2(result, "\"");    p = RSTRING_PTR(str); pend = RSTRING_END(str);    prev = p;    while (p < pend) {        unsigned int c, cc;        int n;        n = rb_enc_precise_mbclen(p, pend, enc);        if (!MBCLEN_CHARFOUND_P(n)) {            if (p > prev) str_buf_cat(result, prev, p - prev);            n = rb_enc_mbminlen(enc);            if (pend < p + n)                n = (int)(pend - p);            while (n--) {                snprintf(buf, CHAR_ESC_LEN, "\\x%02X", *p & 0377);                str_buf_cat(result, buf, strlen(buf));                prev = ++p;            }            continue;        }        n = MBCLEN_CHARFOUND_LEN(n);        c = rb_enc_mbc_to_codepoint(p, pend, enc);        p += n;        if ((asciicompat || unicode_p) &&          (c == '"'|| c == '\\' ||            (c == '#' &&             p < pend &&             MBCLEN_CHARFOUND_P(rb_enc_precise_mbclen(p,pend,enc)) &&             (cc = rb_enc_codepoint(p,pend,enc),              (cc == '$' || cc == '@' || cc == '{'))))) {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            str_buf_cat2(result, "\\");            if (asciicompat || enc == resenc) {                prev = p - n;                continue;            }        }        switch (c) {          case '\n': cc = 'n'; break;          case '\r': cc = 'r'; break;          case '\t': cc = 't'; break;          case '\f': cc = 'f'; break;          case '\013': cc = 'v'; break;          
case '\010': cc = 'b'; break;          case '\007': cc = 'a'; break;          case 033: cc = 'e'; break;          default: cc = 0; break;        }        if (cc) {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            buf[0] = '\\';            buf[1] = (char)cc;            str_buf_cat(result, buf, 2);            prev = p;            continue;        }        /* The special casing of 0x85 (NEXT_LINE) here is because         * Oniguruma historically treats it as printable, but it         * doesn't match the print POSIX bracket class or character         * property in regexps.         *         * See Ruby Bug #16842 for details:         * https://bugs.ruby-lang.org/issues/16842         */        if ((enc == resenc && rb_enc_isprint(c, enc) && c != 0x85) ||            (asciicompat && rb_enc_isascii(c, enc) && ISPRINT(c))) {            continue;        }        else {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            rb_str_buf_cat_escaped_char(result, c, unicode_p);            prev = p;            continue;        }    }    if (p > prev) str_buf_cat(result, prev, p - prev);    str_buf_cat2(result, "\"");    return result;}

Returns a printable version of self, enclosed in double-quotes.

Most printable characters are rendered simply as themselves:

    'abc'.inspect        # => "\"abc\""
    '012'.inspect        # => "\"012\""
    ''.inspect           # => "\"\""
    "\u000012".inspect   # => "\"\\u000012\""
    'тест'.inspect       # => "\"тест\""
    'こんにちは'.inspect # => "\"こんにちは\""

But the printable characters double-quote ('"') and backslash ('\\') are escaped:

    '"'.inspect  # => "\"\\\"\""
    '\\'.inspect # => "\"\\\\\""

Unprintable characters are the ASCII characters whose values are in range 0..31, along with the character whose value is 127.

Most of these characters are rendered thus:

    0.chr.inspect # => "\"\\x00\""
    1.chr.inspect # => "\"\\x01\""
    2.chr.inspect # => "\"\\x02\""
    # ...

A few, however, have special renderings:

    7.chr.inspect  # => "\"\\a\""  # BEL
    8.chr.inspect  # => "\"\\b\""  # BS
    9.chr.inspect  # => "\"\\t\""  # TAB
    10.chr.inspect # => "\"\\n\""  # LF
    11.chr.inspect # => "\"\\v\""  # VT
    12.chr.inspect # => "\"\\f\""  # FF
    13.chr.inspect # => "\"\\r\""  # CR
    27.chr.inspect # => "\"\\e\""  # ESC

Related: see Converting to Non-String.

intern → symbol

Source

    VALUE
    rb_str_intern(VALUE str)
    {
        return sym_find_or_insert_dynamic_symbol(&ruby_global_symbols, str);
    }

Returns the Symbol object derived from self, creating it if it did not already exist:

    'foo'.intern        # => :foo
    'тест'.intern       # => :тест
    'こんにちは'.intern # => :こんにちは

Related: see Converting to Non-String.

Also aliased as: to_sym

length → integer

Source

    VALUE
    rb_str_length(VALUE str)
    {
        return LONG2NUM(str_strlen(str, NULL));
    }

Returns the count of characters (not bytes) in self:

    'foo'.length        # => 3
    'тест'.length       # => 4
    'こんにちは'.length # => 5

Contrast with String#bytesize:

    'foo'.bytesize        # => 3
    'тест'.bytesize       # => 8
    'こんにちは'.bytesize # => 15

Related: see Querying.

Also aliased as: size

lines(record_separator = $/, chomp: false) → array_of_strings

Source

    static VALUE
    rb_str_lines(int argc, VALUE *argv, VALUE str)
    {
        VALUE ary = WANTARRAY("lines", 0);
        return rb_str_enumerate_lines(argc, argv, str, ary);
    }

Returns substrings (“lines”) of self according to the given arguments:

    s = <<~EOT
      This is the first line.
      This is line two.

      This is line four.
      This is line five.
    EOT

With the default argument values:

    $/ # => "\n"
    s.lines
    # =>
    ["This is the first line.\n",
     "This is line two.\n",
     "\n",
     "This is line four.\n",
     "This is line five.\n"]

With a different record_separator:

    record_separator = ' is '
    s.lines(record_separator)
    # =>
    ["This is ",
     "the first line.\nThis is ",
     "line two.\n\nThis is ",
     "line four.\nThis is ",
     "line five.\n"]

With keyword argument chomp as true, removes the trailing newline from each line:

    s.lines(chomp: true)
    # =>
    ["This is the first line.",
     "This is line two.",
     "",
     "This is line four.",
     "This is line five."]

Related: see Converting to Non-String.

ljust(width, pad_string = ' ') → new_string

Source

    static VALUE
    rb_str_ljust(int argc, VALUE *argv, VALUE str)
    {
        return rb_str_justify(argc, argv, str, 'l');
    }

Returns a copy of self, left-justified and, if necessary, right-padded with the pad_string:

    'hello'.ljust(10)       # => "hello     "
    '  hello'.ljust(10)     # => "  hello   "
    'hello'.ljust(10, 'ab') # => "helloababa"
    'тест'.ljust(10)        # => "тест      "
    'こんにちは'.ljust(10)  # => "こんにちは     "

If width <= self.length, returns a copy of self:

    'hello'.ljust(5) # => "hello"
    'hello'.ljust(1) # => "hello"  # Does not truncate to width.

Related: see Converting to New String.

lstrip → new_string

Source

static VALUErb_str_lstrip(VALUE str){    char *start;    long len, loffset;    RSTRING_GETMEM(str, start, len);    loffset = lstrip_offset(str, start, start+len, STR_ENC_GET(str));    if (loffset <= 0) return str_duplicate(rb_cString, str);    return rb_str_subseq(str, loffset, len - loffset);}

Returns a copy of self with leading whitespace removed; see Whitespace in Strings:

    whitespace = "\x00\t\n\v\f\r "
    s = whitespace + 'abc' + whitespace
    # => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
    s.lstrip
    # => "abc\u0000\t\n\v\f\r "

Related: see Converting to New String.

lstrip! → self or nil

Source

static VALUErb_str_lstrip_bang(VALUE str){    rb_encoding *enc;    char *start, *s;    long olen, loffset;    str_modify_keep_cr(str);    enc = STR_ENC_GET(str);    RSTRING_GETMEM(str, start, olen);    loffset = lstrip_offset(str, start, start+olen, enc);    if (loffset > 0) {        long len = olen-loffset;        s = start + loffset;        memmove(start, s, len);        STR_SET_LEN(str, len);        TERM_FILL(start+len, rb_enc_mbminlen(enc));        return str;    }    return Qnil;}

Like String#lstrip, except that:

Performs stripping in self (not in a copy of self).
Returns self if any characters are stripped, nil otherwise.
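A small sketch of both return values, using a hypothetical sample string; the dup call is an assumption added so the sketch also works where string literals are frozen:

```ruby
s = "  \t abc ".dup
r = s.lstrip!   # Leading whitespace is stripped in place.
r.equal?(s)     # => true
s               # => "abc "

n = s.lstrip!   # Nothing left to strip.
n               # => nil
```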

Related: see Modifying.

match(pattern, offset = 0) → matchdata or nil

match(pattern, offset = 0) {|matchdata| ... } → object

Source

static VALUErb_str_match_m(int argc, VALUE *argv, VALUE str){    VALUE re, result;    if (argc < 1)        rb_check_arity(argc, 1, 2);    re = argv[0];    argv[0] = str;    result = rb_funcallv(get_pat(re), rb_intern("match"), argc, argv);    if (!NIL_P(result) && rb_block_given_p()) {        return rb_yield(result);    }    return result;}

Creates a MatchData object based on self and the given arguments; updates Regexp Global Variables.

Computes regexp by converting pattern (if not already a Regexp):
```
regexp = Regexp.new(pattern)
```
Computes matchdata, which will be either a MatchData object or nil (see Regexp#match):
```
matchdata = regexp.match(self[offset..])
```

With no block given, returns the computed matchdata or nil:

    'foo'.match('f')    # => #<MatchData "f">
    'foo'.match('o')    # => #<MatchData "o">
    'foo'.match('x')    # => nil
    'foo'.match('f', 1) # => nil
    'foo'.match('o', 1) # => #<MatchData "o">

With a block given and computed matchdata non-nil, calls the block with matchdata; returns the block’s return value:

    'foo'.match(/o/) {|matchdata| matchdata } # => #<MatchData "o">

With a block given and nil matchdata, does not call the block:

    'foo'.match(/x/) {|matchdata| fail 'Cannot happen' } # => nil

Related: see Querying.

match?(pattern, offset = 0) → true or false

Source

    static VALUE
    rb_str_match_m_p(int argc, VALUE *argv, VALUE str)
    {
        VALUE re;
        rb_check_arity(argc, 1, 2);
        re = get_pat(argv[0]);
        return rb_reg_match_p(re, str, argc > 1 ? NUM2LONG(argv[1]) : 0);
    }

Returns whether a match is found for self and the given arguments; does not update Regexp Global Variables.

Computes regexp by converting pattern (if not already a Regexp):

    regexp = Regexp.new(pattern)

Returns true if self[offset..].match(regexp) returns a MatchData object, false otherwise:

    'foo'.match?(/o/)    # => true
    'foo'.match?('o')    # => true
    'foo'.match?(/x/)    # => false
    'foo'.match?('f', 1) # => false
    'foo'.match?('o', 1) # => true
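The difference from String#match regarding Regexp global variables can be sketched as follows; the local variable names are illustrative only:

```ruby
'foo'.match(/f/)               # Sets $~.
before = $~[0]                 # => "f"

found = 'foo'.match?(/o/)      # => true
after_predicate = $~[0]        # => "f"  # $~ is left as it was.

'foo'.match(/o/)
after_match = $~[0]            # => "o"  # $~ is updated by match.
```

Since match? builds no MatchData and leaves the global variables alone, it may be preferable when only a true/false answer is needed.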

Related: see Querying.

next

Alias for: succ

next!

Alias for: succ!

oct → integer

Source

    static VALUE
    rb_str_oct(VALUE str)
    {
        return rb_str_to_inum(str, -8, FALSE);
    }

Interprets the leading substring of self as octal, binary, decimal, or hexadecimal, possibly signed; returns its value as an integer.

In brief:

    # Interpreted as octal.
    '777'.oct   # => 511
    '777x'.oct  # => 511
    '0777'.oct  # => 511
    '0o777'.oct # => 511
    '-777'.oct  # => -511
    # Not interpreted as octal.
    '0b111'.oct # => 7     # Interpreted as binary.
    '0d999'.oct # => 999   # Interpreted as decimal.
    '0xfff'.oct # => 4095  # Interpreted as hexadecimal.

The leading substring is interpreted as octal when it begins with:

One or more characters representing octal digits (each in the range '0'..'7'); the string to be interpreted ends at the first character that does not represent an octal digit:
```
'7'.oct      # => 7
'11'.oct     # => 9
'777'.oct    # => 511
'0777'.oct   # => 511
'7778'.oct   # => 511
'777x'.oct   # => 511
```
'0o', followed by one or more octal digits:
```
'0o777'.oct  # => 511
'0o7778'.oct # => 511
```

The leading substring is not interpreted as octal when it begins with:

'0b', followed by one or more characters representing binary digits (each in the range '0'..'1'); the string to be interpreted ends at the first character that does not represent a binary digit; the digits are interpreted as binary (base 2):
```
'0b111'.oct  # => 7
'0b1112'.oct # => 7
```
'0d', followed by one or more characters representing decimal digits (each in the range '0'..'9'); the string to be interpreted ends at the first character that does not represent a decimal digit; the digits are interpreted as decimal (base 10):
```
'0d999'.oct  # => 999
'0d999x'.oct # => 999
```
'0x', followed by one or more characters representing hexadecimal digits (each in one of the ranges '0'..'9', 'a'..'f', or 'A'..'F'); the string to be interpreted ends at the first character that does not represent a hexadecimal digit; the digits are interpreted as hexadecimal (base 16):
```
'0xfff'.oct  # => 4095
'0xfffg'.oct # => 4095
```

Any of the above may be prefixed with '-', which negates the interpreted value:

    '-777'.oct   # => -511
    '-0777'.oct  # => -511
    '-0b111'.oct # => -7
    '-0xfff'.oct # => -4095

For any substring not described above, returns zero:

    'foo'.oct # => 0
    ''.oct    # => 0

Related: see Converting to Non-String.

ord → integer

Source

    static VALUE
    rb_str_ord(VALUE s)
    {
        unsigned int c;

        c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
        return UINT2NUM(c);
    }

Returns the integer ordinal of the first character of self:

    'h'.ord          # => 104
    'hello'.ord      # => 104
    'тест'.ord       # => 1090
    'こんにちは'.ord # => 12371

Related: see Converting to Non-String.

partition(pattern) → [pre_match, first_match, post_match]

Source

static VALUErb_str_partition(VALUE str, VALUE sep){    long pos;    sep = get_pat_quoted(sep, 0);    if (RB_TYPE_P(sep, T_REGEXP)) {        if (rb_reg_search(sep, str, 0, 0) < 0) {            goto failed;        }        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        pos = BEG(0);        sep = rb_str_subseq(str, pos, END(0) - pos);    }    else {        pos = rb_str_index(str, sep, 0);        if (pos < 0) goto failed;    }    return rb_ary_new3(3, rb_str_subseq(str, 0, pos),                          sep,                          rb_str_subseq(str, pos+RSTRING_LEN(sep),                                             RSTRING_LEN(str)-pos-RSTRING_LEN(sep)));  failed:    return rb_ary_new3(3, str_duplicate(rb_cString, str), str_new_empty_String(str), str_new_empty_String(str));}

Returns a 3-element array of substrings of self.

If pattern is matched, returns the array:

    [pre_match, first_match, post_match]

where:

first_match is the first-found matching substring.
pre_match and post_match are the preceding and following substrings.

If pattern is not matched, returns the array:

    [self.dup, "", ""]

Note that in the examples below, a returned string 'hello' is a copy of self, not self.

If pattern is a Regexp, performs the equivalent of self.match(pattern) (also setting pattern-matching global variables):

    'hello'.partition(/h/)  # => ["", "h", "ello"]
    'hello'.partition(/l/)  # => ["he", "l", "lo"]
    'hello'.partition(/l+/) # => ["he", "ll", "o"]
    'hello'.partition(/o/)  # => ["hell", "o", ""]
    'hello'.partition(/^/)  # => ["", "", "hello"]
    'hello'.partition(//)   # => ["", "", "hello"]
    'hello'.partition(/$/)  # => ["hello", "", ""]
    'hello'.partition(/x/)  # => ["hello", "", ""]

If pattern is not a Regexp, converts it to a string (if it is not already one), then performs the equivalent of self.index(pattern) (and does not set pattern-matching global variables):

    'hello'.partition('h')       # => ["", "h", "ello"]
    'hello'.partition('l')       # => ["he", "l", "lo"]
    'hello'.partition('ll')      # => ["he", "ll", "o"]
    'hello'.partition('o')       # => ["hell", "o", ""]
    'hello'.partition('')        # => ["", "", "hello"]
    'hello'.partition('x')       # => ["hello", "", ""]
    'тест'.partition('т')        # => ["", "т", "ест"]
    'こんにちは'.partition('に') # => ["こん", "に", "ちは"]

Related: see Converting to Non-String.

prepend(*other_strings) → self

Source

static VALUErb_str_prepend_multi(int argc, VALUE *argv, VALUE str){    str_modifiable(str);    if (argc == 1) {        rb_str_update(str, 0L, 0L, argv[0]);    }    else if (argc > 1) {        int i;        VALUE arg_str = rb_str_tmp_new(0);        rb_enc_copy(arg_str, str);        for (i = 0; i < argc; i++) {            rb_str_append(arg_str, argv[i]);        }        rb_str_update(str, 0L, 0L, arg_str);    }    return str;}

Prefixes to self the concatenation of the given other_strings; returns self:

    'baz'.prepend('foo', 'bar') # => "foobarbaz"
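Because the receiver itself is modified, a sketch with a named variable may make the in-place behavior easier to see; the dup call is an assumption added so the sketch also works where string literals are frozen:

```ruby
s = 'baz'.dup
r = s.prepend('foo', 'bar')
r.equal?(s)  # => true  # The same object is returned.
s            # => "foobarbaz"

s.prepend    # => "foobarbaz"  # With no arguments, self is unchanged.
```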

Related: see Modifying.

replace(other_string) → self

Source

    VALUE
    rb_str_replace(VALUE str, VALUE str2)
    {
        str_modifiable(str);
        if (str == str2) return str;

        StringValue(str2);
        str_discard(str);
        return str_replace(str, str2);
    }

Replaces the contents of self with the contents of other_string; returns self:

    s = 'foo'        # => "foo"
    s.replace('bar') # => "bar"

Related: see Modifying.

Also aliased as: initialize_copy

reverse → new_string

Source

static VALUErb_str_reverse(VALUE str){    rb_encoding *enc;    VALUE rev;    char *s, *e, *p;    int cr;    if (RSTRING_LEN(str) <= 1) return str_duplicate(rb_cString, str);    enc = STR_ENC_GET(str);    rev = rb_str_new(0, RSTRING_LEN(str));    s = RSTRING_PTR(str); e = RSTRING_END(str);    p = RSTRING_END(rev);    cr = ENC_CODERANGE(str);    if (RSTRING_LEN(str) > 1) {        if (single_byte_optimizable(str)) {            while (s < e) {                *--p = *s++;            }        }        else if (cr == ENC_CODERANGE_VALID) {            while (s < e) {                int clen = rb_enc_fast_mbclen(s, e, enc);                p -= clen;                memcpy(p, s, clen);                s += clen;            }        }        else {            cr = rb_enc_asciicompat(enc) ?                ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID;            while (s < e) {                int clen = rb_enc_mbclen(s, e, enc);                if (clen > 1 || (*s & 0x80)) cr = ENC_CODERANGE_UNKNOWN;                p -= clen;                memcpy(p, s, clen);                s += clen;            }        }    }    STR_SET_LEN(rev, RSTRING_LEN(str));    str_enc_copy_direct(rev, str);    ENC_CODERANGE_SET(rev, cr);    return rev;}

Returns a new string with the characters from self in reverse order:

    'drawer'.reverse       # => "reward"
    'reviled'.reverse      # => "deliver"
    'stressed'.reverse     # => "desserts"
    'semordnilaps'.reverse # => "spalindromes"

Related: see Converting to New String.

reverse! → self

Source

static VALUErb_str_reverse_bang(VALUE str){    if (RSTRING_LEN(str) > 1) {        if (single_byte_optimizable(str)) {            char *s, *e, c;            str_modify_keep_cr(str);            s = RSTRING_PTR(str);            e = RSTRING_END(str) - 1;            while (s < e) {                c = *s;                *s++ = *e;                *e-- = c;            }        }        else {            str_shared_replace(str, rb_str_reverse(str));        }    }    else {        str_modify_keep_cr(str);    }    return str;}

Returns self with its characters reversed:

    'drawer'.reverse!       # => "reward"
    'reviled'.reverse!      # => "deliver"
    'stressed'.reverse!     # => "desserts"
    'semordnilaps'.reverse! # => "spalindromes"

Related: see Modifying.

rindex(pattern, offset = self.length) → integer or nil

Source

static VALUErb_str_rindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    rb_encoding *enc = STR_ENC_GET(str);    long pos, len = str_strlen(str, enc); /* str's enc */    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        pos = NUM2LONG(initpos);        if (pos < 0 && (pos += len) < 0) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }        if (pos > len) pos = len;    }    else {        pos = len;    }    if (RB_TYPE_P(sub, T_REGEXP)) {        /* enc = rb_enc_check(str, sub); */        pos = str_offset(RSTRING_PTR(str), RSTRING_END(str), pos,                         enc, single_byte_optimizable(str));        if (rb_reg_search(sub, str, pos, 1) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = rb_str_sublen(str, BEG(0));            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_rindex(str, sub, pos);        if (pos >= 0) {            pos = rb_str_sublen(str, pos);            return LONG2NUM(pos);        }    }    return Qnil;}

Returns the integer position of the last substring that matches the given argument pattern, or nil if none is found.

When pattern is a string, returns the index of the last matching substring in self:

'foo'.rindex('f')       # => 0
'foo'.rindex('o')       # => 2
'foo'.rindex('oo')      # => 1
'foo'.rindex('ooo')     # => nil
'тест'.rindex('т')      # => 3
'こんにちは'.rindex('ち') # => 3

When pattern is a Regexp, returns the index of the last match in self:

'foo'.rindex(/f/)   # => 0
'foo'.rindex(/o/)   # => 2
'foo'.rindex(/oo/)  # => 1
'foo'.rindex(/ooo/) # => nil

When offset is non-negative, it specifies the last position in self at which a match may start:

'foo'.rindex('o', 0) # => nil
'foo'.rindex('o', 1) # => 1
'foo'.rindex('o', 2) # => 2
'foo'.rindex('o', 3) # => 2

With negative integer argument offset, selects the search position by counting backward from the end of self:

'foo'.rindex('o', -1) # => 2
'foo'.rindex('o', -2) # => 1
'foo'.rindex('o', -3) # => nil
'foo'.rindex('o', -4) # => nil

The last match is the one that starts at the last possible position, which is not necessarily the last of the longest matches:

'foo'.rindex(/o+/) # => 2
$~                 # => #<MatchData "o">

To get the last longest match, combine the pattern with a negative lookbehind:

'foo'.rindex(/(?<!o)o+/) # => 1
$~                       # => #<MatchData "oo">

Or use String#index with a negative lookahead:

'foo'.index(/o+(?!.*o)/) # => 1
$~                       # => #<MatchData "oo">
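As a rough illustration of the offset argument, the sketch below looks for the last path separator at or before a given position. The helper name last_separator_before and the sample path are assumptions made for this example; they are not part of String:

```ruby
# Hypothetical helper (not part of String): index of the last occurrence
# of sep that starts at or before position limit, or nil if there is none.
def last_separator_before(text, sep, limit)
  text.rindex(sep, limit)
end

path = 'usr/local/lib/ruby'
last_separator_before(path, '/', path.size) # => 13
last_separator_before(path, '/', 12)        # => 9
last_separator_before(path, '/', 2)         # => nil
last_separator_before(path, '/', -6)        # => 9 (same as offset 12)
```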

Related: see Querying.

rjust(width, pad_string = ' ') → new_string

Source

static VALUE
rb_str_rjust(int argc, VALUE *argv, VALUE str)
{
    return rb_str_justify(argc, argv, str, 'r');
}

Returns a right-justified copy of self.

If integer argument width is greater than the size (in characters) of self, returns a new string of length width that is a copy of self, right justified and padded on the left with pad_string:

'hello'.rjust(10)       # => "     hello"
'hello  '.rjust(10)     # => "   hello  "
'hello'.rjust(10, 'ab') # => "ababahello"
'тест'.rjust(10)        # => "      тест"
'こんにちは'.rjust(10)    # => "     こんにちは"

If width <= self.size, returns a copy of self:

'hello'.rjust(5, 'ab') # => "hello"
'hello'.rjust(1, 'ab') # => "hello"
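One common use is lining up a numeric column. The following is only a sketch; the prices hash, the column widths, and the '.' padding character are made up for the example:

```ruby
# Hypothetical data: item names and prices to print in aligned columns.
prices = { 'tea' => 3.5, 'coffee' => 12.25, 'cake' => 104.0 }

lines = prices.map do |name, price|
  # The formatted prices differ in length; rjust pads each one to 8 characters.
  name.ljust(8) + format('%.2f', price).rjust(8, '.')
end
puts lines
# Prints:
# tea     ....3.50
# coffee  ...12.25
# cake    ..104.00
```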

Related: see Converting to New String.

rpartition(pattern) → [pre_match, last_match, post_match]

Source

static VALUErb_str_rpartition(VALUE str, VALUE sep){    long pos = RSTRING_LEN(str);    sep = get_pat_quoted(sep, 0);    if (RB_TYPE_P(sep, T_REGEXP)) {        if (rb_reg_search(sep, str, pos, 1) < 0) {            goto failed;        }        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        pos = BEG(0);        sep = rb_str_subseq(str, pos, END(0) - pos);    }    else {        pos = rb_str_sublen(str, pos);        pos = rb_str_rindex(str, sep, pos);        if (pos < 0) {            goto failed;        }    }    return rb_ary_new3(3, rb_str_subseq(str, 0, pos),                          sep,                          rb_str_subseq(str, pos+RSTRING_LEN(sep),                                        RSTRING_LEN(str)-pos-RSTRING_LEN(sep)));  failed:    return rb_ary_new3(3, str_new_empty_String(str), str_new_empty_String(str), str_duplicate(rb_cString, str));}

Returns a 3-element array of substrings of self.

Searches self for a match of pattern, seeking the last match.

If pattern is not matched, returns the array:

["", "", self.dup]

If pattern is matched, returns the array:

[pre_match, last_match, post_match]

where:

last_match is the last-found matching substring.
pre_match and post_match are the preceding and following substrings.

The pattern used is:

pattern itself, if it is a Regexp.
Regexp.quote(pattern), if pattern is a string.

Note that in the examples below, a returned string 'hello' is a copy of self, not self.

If pattern is a Regexp, searches for the last matching substring (also setting pattern-matching global variables):

'hello'.rpartition(/l/)      # => ["hel", "l", "o"]
'hello'.rpartition(/ll/)     # => ["he", "ll", "o"]
'hello'.rpartition(/h/)      # => ["", "h", "ello"]
'hello'.rpartition(/o/)      # => ["hell", "o", ""]
'hello'.rpartition(//)       # => ["hello", "", ""]
'hello'.rpartition(/x/)      # => ["", "", "hello"]
'тест'.rpartition(/т/)       # => ["тес", "т", ""]
'こんにちは'.rpartition(/に/) # => ["こん", "に", "ちは"]

If pattern is not a Regexp, converts it to a string (if it is not already one), then searches for the last matching substring (and does not set pattern-matching global variables):

'hello'.rpartition('l')      # => ["hel", "l", "o"]
'hello'.rpartition('ll')     # => ["he", "ll", "o"]
'hello'.rpartition('h')      # => ["", "h", "ello"]
'hello'.rpartition('o')      # => ["hell", "o", ""]
'hello'.rpartition('')       # => ["hello", "", ""]
'тест'.rpartition('т')       # => ["тес", "т", ""]
'こんにちは'.rpartition('に') # => ["こん", "に", "ちは"]
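Because the result always has three elements, it destructures cleanly. A minimal sketch, assuming a hypothetical helper split_extension that is not part of String:

```ruby
# Hypothetical helper: split a file name at its last dot.
def split_extension(filename)
  base, dot, ext = filename.rpartition('.')
  # With no match, rpartition returns ["", "", filename], so dot is empty.
  dot.empty? ? [filename, nil] : [base, ext]
end

split_extension('archive.tar.gz') # => ["archive.tar", "gz"]
split_extension('README')         # => ["README", nil]
```

The string '.' is matched literally here, since a string pattern is passed through Regexp.quote.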

Related: see Converting to Non-String.

rstrip → new_string

Source

static VALUE
rb_str_rstrip(VALUE str)
{
    rb_encoding *enc;
    char *start;
    long olen, roffset;

    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    roffset = rstrip_offset(str, start, start+olen, enc);

    if (roffset <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, 0, olen-roffset);
}

Returns a copy of self with trailing whitespace removed; see Whitespace in Strings:

whitespace = "\x00\t\n\v\f\r "
s = whitespace + 'abc' + whitespace
s        # => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
s.rstrip # => "\u0000\t\n\v\f\r abc"

Related: see Converting to New String.

rstrip! → self or nil

Source

static VALUE
rb_str_rstrip_bang(VALUE str)
{
    rb_encoding *enc;
    char *start;
    long olen, roffset;

    str_modify_keep_cr(str);
    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    roffset = rstrip_offset(str, start, start+olen, enc);
    if (roffset > 0) {
        long len = olen - roffset;

        STR_SET_LEN(str, len);
        TERM_FILL(start+len, rb_enc_mbminlen(enc));
        return str;
    }
    return Qnil;
}

Like String#rstrip, except that:

Performs stripping in self (not in a copy of self).
Returns self if any characters are stripped, nil otherwise.
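The nil return value is easy to trip over when chaining. A small sketch of the behavior and of one possible workaround (the variable names are illustrative only):

```ruby
line = +"value   \n"   # unary plus yields an unfrozen string
first  = line.rstrip!  # => "value" (self, modified in place)
second = line.rstrip!  # => nil, because nothing was left to strip

# Chaining directly on rstrip! would fail on nil when nothing is stripped;
# falling back to the receiver is one way around that.
name = +'Alice'
shout = (name.rstrip! || name).upcase # => "ALICE"
```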

Related: see Modifying.

scan(pattern) → array_of_results

scan(pattern) {|result| ... } → self

Source

static VALUErb_str_scan(VALUE str, VALUE pat){    VALUE result;    long start = 0;    long last = -1, prev = 0;    char *p = RSTRING_PTR(str); long len = RSTRING_LEN(str);    pat = get_pat_quoted(pat, 1);    mustnot_broken(str);    if (!rb_block_given_p()) {        VALUE ary = rb_ary_new();        while (!NIL_P(result = scan_once(str, pat, &start, 0))) {            last = prev;            prev = start;            rb_ary_push(ary, result);        }        if (last >= 0) rb_pat_search(pat, str, last, 1);        else rb_backref_set(Qnil);        return ary;    }    while (!NIL_P(result = scan_once(str, pat, &start, 1))) {        last = prev;        prev = start;        rb_yield(result);        str_mod_check(str, p, len);    }    if (last >= 0) rb_pat_search(pat, str, last, 1);    return str;}

Matches a pattern against self:

If pattern is a Regexp, the pattern used is pattern itself.
If pattern is a string, the pattern used is Regexp.quote(pattern).

Generates a collection of matching results and updates regexp-related global variables:

If the pattern contains no groups, each result is a matched substring.
If the pattern contains groups, each result is an array containing a matched substring for each group.

With no block given, returns an array of the results:

'cruel world'.scan(/\w+/)       # => ["cruel", "world"]
'cruel world'.scan(/.../)       # => ["cru", "el ", "wor"]
'cruel world'.scan(/(...)/)     # => [["cru"], ["el "], ["wor"]]
'cruel world'.scan(/(..)(..)/)  # => [["cr", "ue"], ["l ", "wo"]]
'тест'.scan(/../)               # => ["те", "ст"]
'こんにちは'.scan(/../)          # => ["こん", "にち"]
'abracadabra'.scan('ab')        # => ["ab", "ab"]
'abracadabra'.scan('nosuch')    # => []

With a block given, calls the block with each result; returns self:

'cruel world'.scan(/\w+/) {|w| p w }
# => "cruel"
# => "world"
'cruel world'.scan(/(.)(.)/) {|x, y| p [x, y] }
# => ["c", "r"]
# => ["u", "e"]
# => ["l", " "]
# => ["w", "o"]
# => ["r", "l"]
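To show how groups shape the results, here is a sketch that pulls key=value pairs out of a line of text. The input string and variable names are invented for the example:

```ruby
# Hypothetical input: a line of whitespace-separated key=value pairs.
input = 'host=example.org port=8080 mode=fast'

# The pattern has two groups, so each result is a two-element array.
pairs = input.scan(/(\w+)=(\S+)/)
# => [["host", "example.org"], ["port", "8080"], ["mode", "fast"]]
settings = pairs.to_h
settings['port'] # => "8080"

# With a block, each result is yielded and scan returns the receiver.
keys = []
returned = input.scan(/(\w+)=(\S+)/) { |key, _value| keys << key }
keys                   # => ["host", "port", "mode"]
returned.equal?(input) # => true
```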

Related: see Converting to Non-String.

scrub(replacement_string = default_replacement_string) → new_string

scrub{|sequence| ... } → new_string

Source

static VALUE
str_scrub(int argc, VALUE *argv, VALUE str)
{
    VALUE repl = argc ? (rb_check_arity(argc, 0, 1), argv[0]) : Qnil;
    VALUE new = rb_str_scrub(str, repl);
    return NIL_P(new) ? str_duplicate(rb_cString, str): new;
}

Returns a copy of self with each invalid byte sequence replaced by the given replacement_string.

With no block given, replaces each invalid sequence with replacement_string (by default, "�" for a Unicode encoding, '?' otherwise):

"foo\x81\x81bar".scrub                             # => "foo��bar"
"foo\x81\x81bar".force_encoding('US-ASCII').scrub # => "foo??bar"
"foo\x81\x81bar".scrub('xyzzy')                   # => "fooxyzzyxyzzybar"

With a block given, calls the block with each invalid sequence, and replaces that sequence with the return value of the block:

"foo\x81\x81bar".scrub {|sequence| p sequence; 'XYZZY' } # => "fooXYZZYXYZZYbar"

Output:

"\x81"
"\x81"

Related: see Converting to New String.
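A typical use is cleaning text that arrives with stray bytes. The sketch below assumes a made-up input string tagged as UTF-8, and the "<hex>" replacement format is only one possible choice:

```ruby
# Hypothetical input: valid UTF-8 ("café ") followed by a stray 0xFF byte.
raw = "caf\xC3\xA9 \xFFmenu".dup.force_encoding('UTF-8')
raw.valid_encoding?     # => false

clean = raw.scrub('?')  # => "café ?menu"
clean.valid_encoding?   # => true

# The block form receives the invalid bytes, here rendered as hex.
tagged = raw.scrub { |bytes| "<#{bytes.unpack1('H*')}>" }
# => "café <ff>menu"
```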

scrub!(replacement_string = default_replacement_string) → self

scrub!{|sequence| ... } → self

Source

static VALUE
str_scrub_bang(int argc, VALUE *argv, VALUE str)
{
    VALUE repl = argc ? (rb_check_arity(argc, 0, 1), argv[0]) : Qnil;
    VALUE new = rb_str_scrub(str, repl);
    if (!NIL_P(new)) rb_str_replace(str, new);
    return str;
}

Like String#scrub, except that:

Any replacements are made in self.
Returns self.

Related: see Modifying.

setbyte(index, integer) → integer

Source

VALUErb_str_setbyte(VALUE str, VALUE index, VALUE value){    long pos = NUM2LONG(index);    long len = RSTRING_LEN(str);    char *ptr, *head, *left = 0;    rb_encoding *enc;    int cr = ENC_CODERANGE_UNKNOWN, width, nlen;    if (pos < -len || len <= pos)        rb_raise(rb_eIndexError, "index %ld out of string", pos);    if (pos < 0)        pos += len;    VALUE v = rb_to_int(value);    VALUE w = rb_int_and(v, INT2FIX(0xff));    char byte = (char)(NUM2INT(w) & 0xFF);    if (!str_independent(str))        str_make_independent(str);    enc = STR_ENC_GET(str);    head = RSTRING_PTR(str);    ptr = &head[pos];    if (!STR_EMBED_P(str)) {        cr = ENC_CODERANGE(str);        switch (cr) {          case ENC_CODERANGE_7BIT:            left = ptr;            *ptr = byte;            if (ISASCII(byte)) goto end;            nlen = rb_enc_precise_mbclen(left, head+len, enc);            if (!MBCLEN_CHARFOUND_P(nlen))                ENC_CODERANGE_SET(str, ENC_CODERANGE_BROKEN);            else                ENC_CODERANGE_SET(str, ENC_CODERANGE_VALID);            goto end;          case ENC_CODERANGE_VALID:            left = rb_enc_left_char_head(head, ptr, head+len, enc);            width = rb_enc_precise_mbclen(left, head+len, enc);            *ptr = byte;            nlen = rb_enc_precise_mbclen(left, head+len, enc);            if (!MBCLEN_CHARFOUND_P(nlen))                ENC_CODERANGE_SET(str, ENC_CODERANGE_BROKEN);            else if (MBCLEN_CHARFOUND_LEN(nlen) != width || ISASCII(byte))                ENC_CODERANGE_CLEAR(str);            goto end;        }    }    ENC_CODERANGE_CLEAR(str);    *ptr = byte;  end:    return value;}

Sets the byte at zero-based offset index to the value of the given integer; returns integer:

s = 'xyzzy'
s.setbyte(2, 129) # => 129
s                 # => "xy\x81zy"
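Judging from the source shown above, a negative index counts back from the end, only the low 8 bits of the integer are stored, and an out-of-range index raises IndexError. A short sketch of those three cases, with values chosen purely for illustration:

```ruby
s = +'xyzzy'
s.setbyte(-1, 0x21)   # => 33; index -1 addresses the last byte
s                     # => "xyzz!"

s.setbyte(0, 0x158)   # => 344; only the low byte 0x58 ("X") is stored
s                     # => "Xyzz!"

# An index outside -s.bytesize...s.bytesize raises IndexError.
message =
  begin
    s.setbyte(5, 0x41)
  rescue IndexError => e
    e.message
  end
```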

Related: see Modifying.

shellescape → string

Source

# File lib/shellwords.rb, line 238
def shellescape
  Shellwords.escape(self)
end

Escapes str so that it can be safely used in a Bourne shell command line.

See Shellwords.shellescape for details.

shellsplit → array

Source

# File lib/shellwords.rb, line 227
def shellsplit
  Shellwords.split(self)
end

Splits str into an array of tokens in the same way the UNIX Bourne shell does.

See Shellwords.shellsplit for details.
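Both methods are defined in lib/shellwords.rb, so they are available only after that library is loaded. A minimal round-trip sketch, using a made-up file name:

```ruby
require 'shellwords'

# Hypothetical file name containing spaces and a quote.
filename = "my report's draft.txt"

command = "cat #{filename.shellescape}"
command            # => "cat my\\ report\\'s\\ draft.txt"

# Splitting the command line recovers the original file name as one token.
tokens = command.shellsplit
tokens             # => ["cat", "my report's draft.txt"]
```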

size

Alias for:length

slice

Alias for:[]

slice!(index) → new_string or nil

slice!(start, length) → new_string or nil

slice!(range) → new_string or nil

slice!(regexp, capture = 0) → new_string or nil

slice!(substring) → new_string or nil

Source

static VALUErb_str_slice_bang(int argc, VALUE *argv, VALUE str){    VALUE result = Qnil;    VALUE indx;    long beg, len = 1;    char *p;    rb_check_arity(argc, 1, 2);    str_modify_keep_cr(str);    indx = argv[0];    if (RB_TYPE_P(indx, T_REGEXP)) {        if (rb_reg_search(indx, str, 0, 0) < 0) return Qnil;        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        int nth = 0;        if (argc > 1 && (nth = rb_reg_backref_number(match, argv[1])) < 0) {            if ((nth += regs->num_regs) <= 0) return Qnil;        }        else if (nth >= regs->num_regs) return Qnil;        beg = BEG(nth);        len = END(nth) - beg;        goto subseq;    }    else if (argc == 2) {        beg = NUM2LONG(indx);        len = NUM2LONG(argv[1]);        goto num_index;    }    else if (FIXNUM_P(indx)) {        beg = FIX2LONG(indx);        if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;        if (!len) return Qnil;        beg = p - RSTRING_PTR(str);        goto subseq;    }    else if (RB_TYPE_P(indx, T_STRING)) {        beg = rb_str_index(str, indx, 0);        if (beg == -1) return Qnil;        len = RSTRING_LEN(indx);        result = str_duplicate(rb_cString, indx);        goto squash;    }    else {        switch (rb_range_beg_len(indx, &beg, &len, str_strlen(str, NULL), 0)) {          case Qnil:            return Qnil;          case Qfalse:            beg = NUM2LONG(indx);            if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;            if (!len) return Qnil;            beg = p - RSTRING_PTR(str);            goto subseq;          default:            goto num_index;        }    }  num_index:    if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;    beg = p - RSTRING_PTR(str);  subseq:    result = rb_str_new(RSTRING_PTR(str)+beg, len);    rb_enc_cr_str_copy_for_substr(result, str);  squash:    if (len > 0) {        if (beg == 0) {            rb_str_drop_bytes(str, len);        }        else {            char *sptr 
= RSTRING_PTR(str);            long slen = RSTRING_LEN(str);            if (beg + len > slen) /* pathological check */                len = slen - beg;            memmove(sptr + beg,                    sptr + beg + len,                    slen - (beg + len));            slen -= len;            STR_SET_LEN(str, slen);            TERM_FILL(&sptr[slen], TERM_LEN(str));        }    }    return result;}

Like String#[] (and its alias String#slice), except that:

Removes the selected substring from self (instead of only returning a copy of it).
Returns the removed substring if any modifications were made, nil otherwise.

A few examples:

s = 'hello'
s.slice!('e') # => "e"
s             # => "hllo"
s.slice!('e') # => nil
s             # => "hllo"
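The other call forms listed in the signatures behave the same way. A brief sketch that walks one string through several of them (the sample string is arbitrary):

```ruby
s = +'hello world'
a = s.slice!(0)      # => "h";    s is now "ello world"
b = s.slice!(0, 4)   # => "ello"; s is now " world"
c = s.slice!(/\s+/)  # => " ";    s is now "world"
d = s.slice!(1..2)   # => "or";   s is now "wld"
e = s.slice!('x')    # => nil;    s is unchanged
s                    # => "wld"
```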

Related: see Modifying.

split(field_sep = $;, limit = 0) → array_of_substrings

split(field_sep = $;, limit = 0) {|substring| ... } → self

Source

static VALUErb_str_split_m(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    VALUE spat;    VALUE limit;    split_type_t split_type;    long beg, end, i = 0, empty_count = -1;    int lim = 0;    VALUE result, tmp;    result = rb_block_given_p() ? Qfalse : Qnil;    if (rb_scan_args(argc, argv, "02", &spat, &limit) == 2) {        lim = NUM2INT(limit);        if (lim <= 0) limit = Qnil;        else if (lim == 1) {            if (RSTRING_LEN(str) == 0)                return result ? rb_ary_new2(0) : str;            tmp = str_duplicate(rb_cString, str);            if (!result) {                rb_yield(tmp);                return str;            }            return rb_ary_new3(1, tmp);        }        i = 1;    }    if (NIL_P(limit) && !lim) empty_count = 0;    enc = STR_ENC_GET(str);    split_type = SPLIT_TYPE_REGEXP;    if (!NIL_P(spat)) {        spat = get_pat_quoted(spat, 0);    }    else if (NIL_P(spat = rb_fs)) {        split_type = SPLIT_TYPE_AWK;    }    else if (!(spat = rb_fs_check(spat))) {        rb_raise(rb_eTypeError, "value of $; must be String or Regexp");    }    else {        rb_category_warn(RB_WARN_CATEGORY_DEPRECATED, "$; is set to non-nil value");    }    if (split_type != SPLIT_TYPE_AWK) {        switch (BUILTIN_TYPE(spat)) {          case T_REGEXP:            rb_reg_options(spat); /* check if uninitialized */            tmp = RREGEXP_SRC(spat);            split_type = literal_split_pattern(tmp, SPLIT_TYPE_REGEXP);            if (split_type == SPLIT_TYPE_AWK) {                spat = tmp;                split_type = SPLIT_TYPE_STRING;            }            break;          case T_STRING:            mustnot_broken(spat);            split_type = literal_split_pattern(spat, SPLIT_TYPE_STRING);            break;          default:            UNREACHABLE_RETURN(Qnil);        }    }#define SPLIT_STR(beg, len) ( \        empty_count = split_string(result, str, beg, len, empty_count), \        str_mod_check(str, str_start, str_len))    beg = 0;    
char *ptr = RSTRING_PTR(str);    char *const str_start = ptr;    const long str_len = RSTRING_LEN(str);    char *const eptr = str_start + str_len;    if (split_type == SPLIT_TYPE_AWK) {        char *bptr = ptr;        int skip = 1;        unsigned int c;        if (result) result = rb_ary_new();        end = beg;        if (is_ascii_string(str)) {            while (ptr < eptr) {                c = (unsigned char)*ptr++;                if (skip) {                    if (ascii_isspace(c)) {                        beg = ptr - bptr;                    }                    else {                        end = ptr - bptr;                        skip = 0;                        if (!NIL_P(limit) && lim <= i) break;                    }                }                else if (ascii_isspace(c)) {                    SPLIT_STR(beg, end-beg);                    skip = 1;                    beg = ptr - bptr;                    if (!NIL_P(limit)) ++i;                }                else {                    end = ptr - bptr;                }            }        }        else {            while (ptr < eptr) {                int n;                c = rb_enc_codepoint_len(ptr, eptr, &n, enc);                ptr += n;                if (skip) {                    if (rb_isspace(c)) {                        beg = ptr - bptr;                    }                    else {                        end = ptr - bptr;                        skip = 0;                        if (!NIL_P(limit) && lim <= i) break;                    }                }                else if (rb_isspace(c)) {                    SPLIT_STR(beg, end-beg);                    skip = 1;                    beg = ptr - bptr;                    if (!NIL_P(limit)) ++i;                }                else {                    end = ptr - bptr;                }            }        }    }    else if (split_type == SPLIT_TYPE_STRING) {        char *substr_start = ptr;        char *sptr = RSTRING_PTR(spat);        long slen 
= RSTRING_LEN(spat);        if (result) result = rb_ary_new();        mustnot_broken(str);        enc = rb_enc_check(str, spat);        while (ptr < eptr &&               (end = rb_memsearch(sptr, slen, ptr, eptr - ptr, enc)) >= 0) {            /* Check we are at the start of a char */            char *t = rb_enc_right_char_head(ptr, ptr + end, eptr, enc);            if (t != ptr + end) {                ptr = t;                continue;            }            SPLIT_STR(substr_start - str_start, (ptr+end) - substr_start);            str_mod_check(spat, sptr, slen);            ptr += end + slen;            substr_start = ptr;            if (!NIL_P(limit) && lim <= ++i) break;        }        beg = ptr - str_start;    }    else if (split_type == SPLIT_TYPE_CHARS) {        int n;        if (result) result = rb_ary_new_capa(RSTRING_LEN(str));        mustnot_broken(str);        enc = rb_enc_get(str);        while (ptr < eptr &&               (n = rb_enc_precise_mbclen(ptr, eptr, enc)) > 0) {            SPLIT_STR(ptr - str_start, n);            ptr += n;            if (!NIL_P(limit) && lim <= ++i) break;        }        beg = ptr - str_start;    }    else {        if (result) result = rb_ary_new();        long len = RSTRING_LEN(str);        long start = beg;        long idx;        int last_null = 0;        struct re_registers *regs;        VALUE match = 0;        for (; rb_reg_search(spat, str, start, 0) >= 0;             (match ? 
(rb_match_unbusy(match), rb_backref_set(match)) : (void)0)) {            match = rb_backref_get();            if (!result) rb_match_busy(match);            regs = RMATCH_REGS(match);            end = BEG(0);            if (start == end && BEG(0) == END(0)) {                if (!ptr) {                    SPLIT_STR(0, 0);                    break;                }                else if (last_null == 1) {                    SPLIT_STR(beg, rb_enc_fast_mbclen(ptr+beg, eptr, enc));                    beg = start;                }                else {                    if (start == len)                        start++;                    else                        start += rb_enc_fast_mbclen(ptr+start,eptr,enc);                    last_null = 1;                    continue;                }            }            else {                SPLIT_STR(beg, end-beg);                beg = start = END(0);            }            last_null = 0;            for (idx=1; idx < regs->num_regs; idx++) {                if (BEG(idx) == -1) continue;                SPLIT_STR(BEG(idx), END(idx)-BEG(idx));            }            if (!NIL_P(limit) && lim <= ++i) break;        }        if (match) rb_match_unbusy(match);    }    if (RSTRING_LEN(str) > 0 && (!NIL_P(limit) || RSTRING_LEN(str) > beg || lim < 0)) {        SPLIT_STR(beg, RSTRING_LEN(str)-beg);    }    return result ? result : str;}

Creates an array of substrings by splitting self at each occurrence of the given field separator field_sep.

With no arguments given, splits using the field separator $;, whose default value is nil.

With no block given, returns the array of substrings:

'abracadabra'.split('a') # => ["", "br", "c", "d", "br"]

When field_sep is nil or ' ' (a single space), splits at each sequence of whitespace:

'foo bar baz'.split(nil)          # => ["foo", "bar", "baz"]
'foo bar baz'.split(' ')          # => ["foo", "bar", "baz"]
"foo \n\tbar\t\n  baz".split(' ') # => ["foo", "bar", "baz"]
'foo  bar   baz'.split(' ')       # => ["foo", "bar", "baz"]
''.split(' ')                     # => []

When field_sep is an empty string, splits at every character:

'abracadabra'.split('') # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]
''.split('')            # => []
'тест'.split('')        # => ["т", "е", "с", "т"]
'こんにちは'.split('')   # => ["こ", "ん", "に", "ち", "は"]

When field_sep is a non-empty string and different from ' ' (a single space), uses that string as the separator:

'abracadabra'.split('a')  # => ["", "br", "c", "d", "br"]
'abracadabra'.split('ab') # => ["", "racad", "ra"]
''.split('a')             # => []
'тест'.split('т')         # => ["", "ес"]
'こんにちは'.split('に')   # => ["こん", "ちは"]

When field_sep is a Regexp, splits at each occurrence of a matching substring:

'abracadabra'.split(/ab/) # => ["", "racad", "ra"]
'1 + 1 == 2'.split(/\W+/) # => ["1", "1", "2"]
'abracadabra'.split(//)   # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]

If the Regexp contains groups, their matches are included in the returned array:

'1:2:3'.split(/(:)()()/, 2) # => ["1", ":", "", "", "2:3"]

Argument limit sets a limit on the size of the returned array; it also determines whether trailing empty strings are included in the returned array.

When limit is zero, there is no limit on the size of the array, but trailing empty strings are omitted:

'abracadabra'.split('', 0)  # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]
'abracadabra'.split('a', 0) # => ["", "br", "c", "d", "br"]  # Empty string after last 'a' omitted.

When limit is a positive integer, there is a limit on the size of the array (no more than limit - 1 splits occur), and trailing empty strings are included:

'abracadabra'.split('', 3)   # => ["a", "b", "racadabra"]
'abracadabra'.split('a', 3)  # => ["", "br", "cadabra"]
'abracadabra'.split('', 30)  # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""]
'abracadabra'.split('a', 30) # => ["", "br", "c", "d", "br", ""]
'abracadabra'.split('', 1)   # => ["abracadabra"]
'abracadabra'.split('a', 1)  # => ["abracadabra"]

When limit is negative, there is no limit on the size of the array, and trailing empty strings are included:

'abracadabra'.split('', -1)  # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""]
'abracadabra'.split('a', -1) # => ["", "br", "c", "d", "br", ""]

If a block is given, calls the block with each substring and returns self:

'foo bar baz'.split(' ') {|substring| p substring }

Output:

"foo"
"bar"
"baz"

Note that the above example is functionally equivalent to:

'foo bar baz'.split(' ').each {|substring| p substring }

Output:

"foo"
"bar"
"baz"

But the latter:

Has poorer performance because it creates an intermediate array.
Returns an array (instead of self).
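Two everyday consequences of the limit argument can be sketched as follows. The header line and the comma-separated row are invented inputs:

```ruby
# A positive limit keeps the remainder intact after the first split.
header = 'Subject: Re: meeting: agenda'
name, value = header.split(': ', 2)
name  # => "Subject"
value # => "Re: meeting: agenda"

# A negative limit preserves trailing empty fields that the default drops.
row = 'a,b,,'
default_fields = row.split(',')     # => ["a", "b"]
all_fields     = row.split(',', -1) # => ["a", "b", "", ""]
```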

Related: see Converting to Non-String.

squeeze(*selectors) → new_string

Source

static VALUE
rb_str_squeeze(int argc, VALUE *argv, VALUE str)
{
    str = str_duplicate(rb_cString, str);
    rb_str_squeeze_bang(argc, argv, str);
    return str;
}

Returns a copy of self with each run (doubling, tripling, etc.) of a specified character “squeezed” down to a single character.

The characters to be squeezed are specified by arguments selectors, each of which is a string; see Character Selectors.

A single argument may be a single character:

'Noooooo!'.squeeze('o')      # => "No!"
'foo  bar  baz'.squeeze(' ') # => "foo bar baz"
'Mississippi'.squeeze('s')   # => "Misisippi"
'Mississippi'.squeeze('p')   # => "Mississipi"
'Mississippi'.squeeze('x')   # => "Mississippi"  # Unused selector character is ignored.
'бессонница'.squeeze('с')    # => "бесонница"
'бессонница'.squeeze('н')    # => "бессоница"

A single argument may be a string of characters:

'Mississippi'.squeeze('sp')       # => "Misisipi"
'Mississippi'.squeeze('ps')       # => "Misisipi"   # Order doesn't matter.
'Mississippi'.squeeze('nonsense') # => "Misisippi"  # Unused selector characters are ignored.

A single argument may be a range of characters:

'Mississippi'.squeeze('a-p') # => "Mississipi"
'Mississippi'.squeeze('q-z') # => "Misisippi"
'Mississippi'.squeeze('a-z') # => "Misisipi"

Multiple arguments are allowed; see Multiple Character Selectors.
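As a small sketch of the remaining cases: with no selectors every repeated character is squeezed, and with several selectors only characters matched by all of them are squeezed. The sample strings are arbitrary:

```ruby
# No selectors: every run of a repeated character is squeezed.
all = 'aaabbbccc'.squeeze                  # => "abc"

# Two selectors intersect: 'a-c' minus 'b' (negated with ^) leaves a and c.
some = 'aaabbbccc'.squeeze('a-c', '^b')    # => "abbbc"

# A common use: collapsing repeated spaces.
tidy = 'too   many    spaces'.squeeze(' ') # => "too many spaces"
```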

Related: see Converting to New String.

squeeze!(*selectors) → self or nil

Source

static VALUErb_str_squeeze_bang(int argc, VALUE *argv, VALUE str){    char squeez[TR_TABLE_SIZE];    rb_encoding *enc = 0;    VALUE del = 0, nodel = 0;    unsigned char *s, *send, *t;    int i, modify = 0;    int ascompat, singlebyte = single_byte_optimizable(str);    unsigned int save;    if (argc == 0) {        enc = STR_ENC_GET(str);    }    else {        for (i=0; i<argc; i++) {            VALUE s = argv[i];            StringValue(s);            enc = rb_enc_check(str, s);            if (singlebyte && !single_byte_optimizable(s))                singlebyte = 0;            tr_setup_table(s, squeez, i==0, &del, &nodel, enc);        }    }    str_modify_keep_cr(str);    s = t = (unsigned char *)RSTRING_PTR(str);    if (!s || RSTRING_LEN(str) == 0) return Qnil;    send = (unsigned char *)RSTRING_END(str);    save = -1;    ascompat = rb_enc_asciicompat(enc);    if (singlebyte) {        while (s < send) {            unsigned int c = *s++;            if (c != save || (argc > 0 && !squeez[c])) {                *t++ = save = c;            }        }    }    else {        while (s < send) {            unsigned int c;            int clen;            if (ascompat && (c = *s) < 0x80) {                if (c != save || (argc > 0 && !squeez[c])) {                    *t++ = save = c;                }                s++;            }            else {                c = rb_enc_codepoint_len((char *)s, (char *)send, &clen, enc);                if (c != save || (argc > 0 && !tr_find(c, squeez, del, nodel))) {                    if (t != s) rb_enc_mbcput(c, t, enc);                    save = c;                    t += clen;                }                s += clen;            }        }    }    TERM_FILL((char *)t, TERM_LEN(str));    if ((char *)t - RSTRING_PTR(str) != RSTRING_LEN(str)) {        STR_SET_LEN(str, (char *)t - RSTRING_PTR(str));        modify = 1;    }    if (modify) return str;    return Qnil;}

Like String#squeeze, except that:

Characters are squeezed in self (not in a copy of self).
Returns self if any changes are made, nil otherwise.

Related: see Modifying.

start_with?(*patterns) → true or false

Source

static VALUErb_str_start_with(int argc, VALUE *argv, VALUE str){    int i;    for (i=0; i<argc; i++) {        VALUE tmp = argv[i];        if (RB_TYPE_P(tmp, T_REGEXP)) {            if (rb_reg_start_with_p(tmp, str))                return Qtrue;        }        else {            const char *p, *s, *e;            long slen, tlen;            rb_encoding *enc;            StringValue(tmp);            enc = rb_enc_check(str, tmp);            if ((tlen = RSTRING_LEN(tmp)) == 0) return Qtrue;            if ((slen = RSTRING_LEN(str)) < tlen) continue;            p = RSTRING_PTR(str);            e = p + slen;            s = p + tlen;            if (!at_char_right_boundary(p, s, e, enc))                continue;            if (memcmp(p, RSTRING_PTR(tmp), tlen) == 0)                return Qtrue;        }    }    return Qfalse;}

Returns whether self starts with any of the given patterns.

For each argument, the pattern used is:

The pattern itself, if it is a Regexp.
Regexp.quote(pattern), if it is a string.

Returns true if any pattern matches the beginning, false otherwise:

'hello'.start_with?('hell')               # => true
'hello'.start_with?(/H/i)                 # => true
'hello'.start_with?('heaven', 'hell')     # => true
'hello'.start_with?('heaven', 'paradise') # => false
'тест'.start_with?('т')                   # => true
'こんにちは'.start_with?('こ')             # => true
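The difference between string and Regexp patterns can be sketched with a URL check. The URL and variable names are illustrative only:

```ruby
url = 'https://example.org/docs'
by_string = url.start_with?('http://', 'https://') # => true
by_regexp = url.start_with?(/https?:/)             # => true
wrong_case = url.start_with?('HTTPS')              # => false (case-sensitive)

# A Regexp is tried only at the beginning of the string.
embedded = 'see https://example.org'.start_with?(/https?:/) # => false

# A string pattern is matched literally, so '.' is not a wildcard.
literal_dot = 'abc'.start_with?('.') # => false
regexp_dot  = 'abc'.start_with?(/./) # => true
```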

Related: see Querying.

strip → new_string

Source

static VALUE
rb_str_strip(VALUE str)
{
    char *start;
    long olen, loffset, roffset;
    rb_encoding *enc = STR_ENC_GET(str);

    RSTRING_GETMEM(str, start, olen);
    loffset = lstrip_offset(str, start, start+olen, enc);
    roffset = rstrip_offset(str, start+loffset, start+olen, enc);

    if (loffset <= 0 && roffset <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, loffset, olen-loffset-roffset);
}

Returns a copy of self with leading and trailing whitespace removed; see Whitespace in Strings:

whitespace = "\x00\t\n\v\f\r "
s = whitespace + 'abc' + whitespace
# => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
s.strip # => "abc"

Related: see Converting to New String.

strip! → self or nil

Source

static VALUE
rb_str_strip_bang(VALUE str)
{
    char *start;
    long olen, loffset, roffset;
    rb_encoding *enc;

    str_modify_keep_cr(str);
    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    loffset = lstrip_offset(str, start, start+olen, enc);
    roffset = rstrip_offset(str, start+loffset, start+olen, enc);

    if (loffset > 0 || roffset > 0) {
        long len = olen-roffset;
        if (loffset > 0) {
            len -= loffset;
            memmove(start, start + loffset, len);
        }
        STR_SET_LEN(str, len);
        TERM_FILL(start+len, rb_enc_mbminlen(enc));
        return str;
    }
    return Qnil;
}

Like String#strip, except that:

Any modifications are made to self.
Returns self if any modifications are made, nil otherwise.

Related: see Modifying.

sub(pattern, replacement) → new_string

sub(pattern) {|match| ... } → new_string

Source

static VALUE
rb_str_sub(int argc, VALUE *argv, VALUE str)
{
    str = str_duplicate(rb_cString, str);
    rb_str_sub_bang(argc, argv, str);
    return str;
}

Returns a copy of self, possibly with a substring replaced.

Argument pattern may be a string or a Regexp; argument replacement may be a string or a Hash.

Varying types for the argument values make this method very versatile.

Below are some simple examples; for many more examples, see Substitution Methods.

With arguments pattern and string replacement given, replaces the first matching substring with the given replacement string:

s = 'abracadabra'      # => "abracadabra"
s.sub('bra', 'xyzzy')  # => "axyzzycadabra"
s.sub(/bra/, 'xyzzy')  # => "axyzzycadabra"
s.sub('nope', 'xyzzy') # => "abracadabra"

With arguments pattern and hash replacement given, replaces the first matching substring with a value from the given replacement hash, or removes it:

h = {'a' => 'A', 'b' => 'B', 'c' => 'C'}
s.sub('b', h) # => "aBracadabra"
s.sub(/b/, h) # => "aBracadabra"
s.sub(/d/, h) # => "abracaabra"  # 'd' removed.

With argument pattern and a block given, calls the block with the first matching substring; replaces that substring with the block’s return value:

s.sub('b') {|match| match.upcase } # => "aBracadabra"
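When pattern is a Regexp, a string replacement may also refer to captured groups. A short sketch with made-up sample strings:

```ruby
name = 'John Smith'

# Numbered groups are referenced as \1, \2, ... in a single-quoted replacement.
numbered = name.sub(/(\w+) (\w+)/, '\2, \1') # => "Smith, John"

# Named groups are referenced as \k<name>.
named = name.sub(/(?<first>\w+) (?<last>\w+)/, '\k<last>, \k<first>')
# => "Smith, John"

# Only the first match is passed to the block and replaced.
bumped = '2024-01-15'.sub(/\d+/) { |year| (year.to_i + 1).to_s }
# => "2025-01-15"
```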

Related: see Converting to New String.

sub!(pattern, replacement) → self or nil

sub!(pattern) {|match| ... } → self or nil

Source

  static VALUE
  rb_str_sub_bang(int argc, VALUE *argv, VALUE str)
  {
      VALUE pat, repl, hash = Qnil;
      int iter = 0;
      long plen;
      int min_arity = rb_block_given_p() ? 1 : 2;
      long beg;

      rb_check_arity(argc, min_arity, 2);
      if (argc == 1) {
          iter = 1;
      }
      else {
          repl = argv[1];
          hash = rb_check_hash_type(argv[1]);
          if (NIL_P(hash)) {
              StringValue(repl);
          }
      }

      pat = get_pat_quoted(argv[0], 1);

      str_modifiable(str);
      beg = rb_pat_search(pat, str, 0, 1);
      if (beg >= 0) {
          rb_encoding *enc;
          int cr = ENC_CODERANGE(str);
          long beg0, end0;
          VALUE match, match0 = Qnil;
          struct re_registers *regs;
          char *p, *rp;
          long len, rlen;

          match = rb_backref_get();
          regs = RMATCH_REGS(match);
          if (RB_TYPE_P(pat, T_STRING)) {
              beg0 = beg;
              end0 = beg0 + RSTRING_LEN(pat);
              match0 = pat;
          }
          else {
              beg0 = BEG(0);
              end0 = END(0);
              if (iter) match0 = rb_reg_nth_match(0, match);
          }

          if (iter || !NIL_P(hash)) {
              p = RSTRING_PTR(str); len = RSTRING_LEN(str);

              if (iter) {
                  repl = rb_obj_as_string(rb_yield(match0));
              }
              else {
                  repl = rb_hash_aref(hash, rb_str_subseq(str, beg0, end0 - beg0));
                  repl = rb_obj_as_string(repl);
              }
              str_mod_check(str, p, len);
              rb_check_frozen(str);
          }
          else {
              repl = rb_reg_regsub(repl, str, regs, RB_TYPE_P(pat, T_STRING) ? Qnil : pat);
          }

          enc = rb_enc_compatible(str, repl);
          if (!enc) {
              rb_encoding *str_enc = STR_ENC_GET(str);
              p = RSTRING_PTR(str); len = RSTRING_LEN(str);
              if (coderange_scan(p, beg0, str_enc) != ENC_CODERANGE_7BIT ||
                  coderange_scan(p+end0, len-end0, str_enc) != ENC_CODERANGE_7BIT) {
                  rb_raise(rb_eEncCompatError, "incompatible character encodings: %s and %s",
                           rb_enc_inspect_name(str_enc),
                           rb_enc_inspect_name(STR_ENC_GET(repl)));
              }
              enc = STR_ENC_GET(repl);
          }
          rb_str_modify(str);
          rb_enc_associate(str, enc);
          if (ENC_CODERANGE_UNKNOWN < cr && cr < ENC_CODERANGE_BROKEN) {
              int cr2 = ENC_CODERANGE(repl);
              if (cr2 == ENC_CODERANGE_BROKEN ||
                  (cr == ENC_CODERANGE_VALID && cr2 == ENC_CODERANGE_7BIT))
                  cr = ENC_CODERANGE_UNKNOWN;
              else
                  cr = cr2;
          }
          plen = end0 - beg0;
          rlen = RSTRING_LEN(repl);
          len = RSTRING_LEN(str);
          if (rlen > plen) {
              RESIZE_CAPA(str, len + rlen - plen);
          }
          p = RSTRING_PTR(str);
          if (rlen != plen) {
              memmove(p + beg0 + rlen, p + beg0 + plen, len - beg0 - plen);
          }
          rp = RSTRING_PTR(repl);
          memmove(p + beg0, rp, rlen);
          len += rlen - plen;
          STR_SET_LEN(str, len);
          TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));
          ENC_CODERANGE_SET(str, cr);

          RB_GC_GUARD(match);

          return str;
      }
      return Qnil;
  }

Like String#sub, except that:

Changes are made to self, not to a copy of self.
Returns self if any changes are made, nil otherwise.
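The two differences can be seen in a short sketch. The sample string is illustrative; `dup` is used only so that the receiver is certainly not frozen.

```ruby
s = 'hello'.dup               # dup guarantees a mutable receiver

hit = s.sub!(/l/, 'L')        # only the first match is replaced, in place
hit.equal?(s)                 # => true (the return value is the receiver itself)
s                             # => "heLlo"

miss = s.sub!(/xyzzy/, '*')   # no match, so nothing changes
miss                          # => nil
s                             # => "heLlo"
```

Because the return value may be nil, chaining another call directly onto sub! is risky; calling the follow-up method on the receiver afterwards avoids the problem.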

Related: see Modifying.

succ → new_str

Source

  VALUE
  rb_str_succ(VALUE orig)
  {
      VALUE str;
      str = rb_str_new(RSTRING_PTR(orig), RSTRING_LEN(orig));
      rb_enc_cr_str_copy_for_substr(str, orig);
      return str_succ(str);
  }

Returns the successor toself. The successor is calculated by incrementing characters.

The first character to be incremented is the rightmost alphanumeric character or, if there are no alphanumerics, the rightmost character:

  'THX1138'.succ # => "THX1139"
  '<<koala>>'.succ # => "<<koalb>>"
  '***'.succ # => "**+"
  'тест'.succ # => "тесу"
  'こんにちは'.succ # => "こんにちば"

The successor to a digit is another digit, “carrying” to the next-left character for a “rollover” from 9 to 0, and prepending another digit if necessary:

  '00'.succ # => "01"
  '09'.succ # => "10"
  '99'.succ # => "100"

The successor to a letter is another letter of the same case, carrying to the next-left character for a rollover, and prepending another same-case letter if necessary:

  'aa'.succ # => "ab"
  'az'.succ # => "ba"
  'zz'.succ # => "aaa"
  'AA'.succ # => "AB"
  'AZ'.succ # => "BA"
  'ZZ'.succ # => "AAA"
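The letter rollover rule is what makes succ convenient for generating spreadsheet-style labels. A minimal sketch, with an arbitrary starting label chosen for illustration:

```ruby
# Start near the end of the alphabet so the rollover is visible.
labels = ['X']
4.times { labels << labels.last.succ }

labels # => ["X", "Y", "Z", "AA", "AB"]
```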

The successor to a non-alphanumeric character is the next character in the underlying character set’s collating sequence, carrying to the next-left character for a rollover, and prepending another character if necessary:

  s = 0.chr * 3 # => "\x00\x00\x00"
  s.succ # => "\x00\x00\x01"
  s = 255.chr * 3 # => "\xFF\xFF\xFF"
  s.succ # => "\x01\x00\x00\x00"

Carrying can occur between and among mixtures of alphanumeric characters:

  s = 'zz99zz99' # => "zz99zz99"
  s.succ # => "aaa00aa00"
  s = '99zz99zz' # => "99zz99zz"
  s.succ # => "100aa00aa"

The successor to an empty String is a new empty String:

  ''.succ # => ""

Related: see Converting to New String.

Also aliased as: next

succ! → self

Source

  static VALUE
  rb_str_succ_bang(VALUE str)
  {
      rb_str_modify(str);
      str_succ(str);
      return str;
  }

Like String#succ, but modifies self in place; returns self.
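A short sketch of the difference between the two methods, using an arbitrary sample identifier (`dup` only guarantees a mutable receiver):

```ruby
id = 'az9'.dup

copy = id.succ        # non-bang: a new string, id is untouched
copy                  # => "ba0"
copy.equal?(id)       # => false
id                    # => "az9"

same = id.succ!       # bang: id itself is incremented
same.equal?(id)       # => true
id                    # => "ba0"
```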

Related: see Modifying.

Also aliased as: next!

sum(n = 16) → integer

Source

  static VALUE
  rb_str_sum(int argc, VALUE *argv, VALUE str)
  {
      int bits = 16;
      char *ptr, *p, *pend;
      long len;
      VALUE sum = INT2FIX(0);
      unsigned long sum0 = 0;

      if (rb_check_arity(argc, 0, 1) && (bits = NUM2INT(argv[0])) < 0) {
          bits = 0;
      }
      ptr = p = RSTRING_PTR(str);
      len = RSTRING_LEN(str);
      pend = p + len;

      while (p < pend) {
          if (FIXNUM_MAX - UCHAR_MAX < sum0) {
              sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));
              str_mod_check(str, ptr, len);
              sum0 = 0;
          }
          sum0 += (unsigned char)*p;
          p++;
      }

      if (bits == 0) {
          if (sum0) {
              sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));
          }
      }
      else {
          if (sum == INT2FIX(0)) {
              if (bits < (int)sizeof(long)*CHAR_BIT) {
                  sum0 &= (((unsigned long)1)<<bits)-1;
              }
              sum = LONG2FIX(sum0);
          }
          else {
              VALUE mod;

              if (sum0) {
                  sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));
              }

              mod = rb_funcall(INT2FIX(1), idLTLT, 1, INT2FIX(bits));
              mod = rb_funcall(mod, '-', 1, INT2FIX(1));
              sum = rb_funcall(sum, '&', 1, mod);
          }
      }
      return sum;
  }

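Returns a basic n-bit checksum of the characters in self; the checksum is the sum of the binary value of each byte in self, modulo 2**n (that is, the sum masked with 2**n - 1). As the source above shows, a zero or negative n applies no mask, so the full sum is returned:

  'hello'.sum # => 532
  'hello'.sum(4) # => 4
  'hello'.sum(64) # => 532
  'тест'.sum # => 1405
  'こんにちは'.sum # => 2582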

This is not a particularly strong checksum.
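The checksum can be reproduced by hand as a byte total followed by a bit mask, which also shows why it is weak: the byte order does not affect the result. A minimal sketch (the variable names are illustrative):

```ruby
s = 'hello'
total = s.bytes.sum                     # 104 + 101 + 108 + 108 + 111 = 532

s.sum    == (total & ((1 << 16) - 1))   # => true (default n is 16)
s.sum(4) == (total & ((1 << 4) - 1))    # => true (532 & 15 is 4)
s.sum(0) == total                       # => true (n of zero applies no mask)

# Reordering the bytes leaves the checksum unchanged.
'ab'.sum == 'ba'.sum                    # => true
```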

Related: see Querying.

swapcase(mapping) → new_string

Source

  static VALUE
  rb_str_swapcase(int argc, VALUE *argv, VALUE str)
  {
      rb_encoding *enc;
      OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_DOWNCASE;
      VALUE ret;

      flags = check_case_options(argc, argv, flags);
      enc = str_true_enc(str);
      if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return str_duplicate(rb_cString, str);
      if (flags&ONIGENC_CASE_ASCII_ONLY) {
          ret = rb_str_new(0, RSTRING_LEN(str));
          rb_str_ascii_casemap(str, ret, &flags, enc);
      }
      else {
          ret = rb_str_casemap(str, &flags, enc);
      }
      return ret;
  }

Returns a string containing the characters in self, with cases reversed:

Each uppercase character is downcased.
Each lowercase character is upcased.

Examples:

  'Hello World!'.swapcase # => "hELLO wORLD!"
  'тест'.swapcase # => "ТЕСТ"

Some characters (and even character sets) do not have casing:

  '12345'.swapcase # => "12345"
  'こんにちは'.swapcase # => "こんにちは"

The casing may be affected by the given mapping; see Case Mapping.
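As one illustration of a mapping, the `:ascii` option restricts case changes to ASCII letters. The sketch below uses a made-up mixed-script string and assumes a UTF-8 source file:

```ruby
s = 'Тест Test'

s.swapcase          # => "тЕСТ tEST" (default: full Unicode case mapping)
s.swapcase(:ascii)  # => "Тест tEST" (only the ASCII letters are swapped)
```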

Related: see Converting to New String.

swapcase!(mapping) → self or nil

Source

  static VALUE
  rb_str_swapcase_bang(int argc, VALUE *argv, VALUE str)
  {
      rb_encoding *enc;
      OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_DOWNCASE;

      flags = check_case_options(argc, argv, flags);
      str_modify_keep_cr(str);
      enc = str_true_enc(str);
      if (flags&ONIGENC_CASE_ASCII_ONLY)
          rb_str_ascii_casemap(str, str, &flags, enc);
      else
          str_shared_replace(str, rb_str_casemap(str, &flags, enc));

      if (ONIGENC_CASE_MODIFIED&flags) return str;
      return Qnil;
  }

Like String#swapcase, except that:

Changes are made to self, not to a copy of self.
Returns self if any changes are made, nil otherwise.
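Both return values can be seen in a short sketch; the sample strings are illustrative, and `dup` only guarantees mutable receivers:

```ruby
s = 'Hello'.dup
changed = s.swapcase!      # cased characters exist, so s is modified
changed.equal?(s)          # => true
s                          # => "hELLO"

digits = '12345'.dup
unchanged = digits.swapcase!
unchanged                  # => nil (nothing to swap)
digits                     # => "12345"
```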

Related: see Modifying.

to_c → complex

Source

  static VALUE
  string_to_c(VALUE self)
  {
      VALUE num;

      rb_must_asciicompat(self);

      (void)parse_comp(rb_str_fill_terminator(self, 1), FALSE, &num);

      return num;
  }

Returns self interpreted as a Complex object; leading whitespace and trailing garbage are ignored:

  '9'.to_c # => (9+0i)
  '2.5'.to_c # => (2.5+0i)
  '2.5/1'.to_c # => ((5/2)+0i)
  '-3/2'.to_c # => ((-3/2)+0i)
  '-i'.to_c # => (0-1i)
  '45i'.to_c # => (0+45i)
  '3-4i'.to_c # => (3-4i)
  '-4e2-4e-2i'.to_c # => (-400.0-0.04i)
  '-0.0-0.0i'.to_c # => (-0.0-0.0i)
  '1/2+3/4i'.to_c # => ((1/2)+(3/4)*i)
  '1.0@0'.to_c # => (1+0.0i)
  "1.0@#{Math::PI/2}".to_c # => (0.0+1i)
  "1.0@#{Math::PI}".to_c # => (-1+0.0i)

Returns Complex zero if the string cannot be converted:

  'ruby'.to_c # => (0+0i)

See Kernel#Complex.

to_f → float

Source

  static VALUE
  rb_str_to_f(VALUE str)
  {
      return DBL2NUM(rb_str_to_dbl(str, FALSE));
  }

Returns the result of interpreting leading characters in self as a Float:

  '3.14159'.to_f  # => 3.14159
  '1.234e-2'.to_f # => 0.01234

Characters past a leading valid number are ignored:

  '3.14 (pi to two places)'.to_f # => 3.14

Returns zero if there is no leading valid number:

  'abcdef'.to_f # => 0.0

See Converting to Non-String.

to_i(base = 10) → integer

Source

  static VALUE
  rb_str_to_i(int argc, VALUE *argv, VALUE str)
  {
      int base = 10;

      if (rb_check_arity(argc, 0, 1) && (base = NUM2INT(argv[0])) < 0) {
          rb_raise(rb_eArgError, "invalid radix %d", base);
      }
      return rb_str_to_inum(str, base, FALSE);
  }

Returns the result of interpreting leading characters in self as an integer in the given base, which must be 0 or in the range 2..36:

  '123456'.to_i # => 123456
  '123def'.to_i(16) # => 1195503

With base zero, the string may contain leading characters that specify the actual base:

  '123def'.to_i(0) # => 123
  '0123def'.to_i(0) # => 83
  '0b123def'.to_i(0) # => 1
  '0o123def'.to_i(0) # => 83
  '0d123def'.to_i(0) # => 123
  '0x123def'.to_i(0) # => 1195503

Characters past a leading valid number (in the given base) are ignored:

  '12.345'.to_i # => 12
  '12345'.to_i(2) # => 1

Returns zero if there is no leading valid number:

  'abcdef'.to_i # => 0
  '2'.to_i(2) # => 0
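A few more bases, plus what happens with a base outside the permitted set, can be sketched as follows. The inputs are illustrative; the out-of-range case is expected to raise ArgumentError, in line with the "invalid radix" error visible in the source above.

```ruby
'1010'.to_i(2)   # => 10
'ff'.to_i(16)    # => 255
'z'.to_i(36)     # => 35

# A base above 36 is rejected rather than silently ignored.
error = begin
  '10'.to_i(37)
rescue ArgumentError => e
  e
end
error.class      # => ArgumentError
```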

to_json_raw(*args)

Source

  # File ext/json/lib/json/add/string.rb, line 32
  def to_json_raw(...)
    to_json_raw_object.to_json(...)
  end

This method creates a JSON text from the result of a call to to_json_raw_object of this String.

to_json_raw_object()

Source

  # File ext/json/lib/json/add/string.rb, line 21
  def to_json_raw_object
    {
      JSON.create_id => self.class.name,
      "raw" => unpack("C*"),
    }
  end

This method creates a raw object hash that can be nested into other data structures and will be generated as a raw string. Use it when you want to convert raw strings, such as binary data, to JSON instead of treating them as UTF-8 strings.
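The shape of the raw object hash can be sketched without calling to_json_raw_object itself, since whether that method is loaded depends on the installed json version and on which files are required. The example below rebuilds the same hash by hand from the source shown above; the sample bytes and variable names are illustrative.

```ruby
require 'json'

binary = "\xFF\x00ab".b                  # bytes that are not valid UTF-8

# Hand-built equivalent of the hash in the source listing above.
raw_object = {
  JSON.create_id => binary.class.name,   # the key is "json_class" by default
  'raw' => binary.unpack('C*')           # => [255, 0, 97, 98]
}

json = JSON.generate(raw_object)

# The byte values survive the round trip and can be packed back.
restored = JSON.parse(json)['raw'].pack('C*')
restored == binary                       # => true
```

Storing the bytes as an array of integers keeps the JSON text valid even when the original string is not valid UTF-8.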

to_r → rational

Source

  static VALUE
  string_to_r(VALUE self)
  {
      VALUE num;

      rb_must_asciicompat(self);

      num = parse_rat(RSTRING_PTR(self), RSTRING_END(self), 0, TRUE);

      if (RB_FLOAT_TYPE_P(num) && !FLOAT_ZERO_P(num))
          rb_raise(rb_eFloatDomainError, "Infinity");
      return num;
  }

Returns the result of interpreting leading characters in self as a rational. Leading whitespace and extraneous characters past the end of a valid number are ignored. Digit sequences can be separated by an underscore. If there is no valid number at the start of self, zero is returned rather than an exception being raised.

  '  2  '.to_r # => (2/1)
  '300/2'.to_r # => (150/1)
  '-9.2'.to_r # => (-46/5)
  '-9.2e2'.to_r # => (-920/1)
  '1_234_567'.to_r # => (1234567/1)
  '21 June 09'.to_r # => (21/1)
  '21/06/09'.to_r # => (7/2)
  'BWV 1079'.to_r # => (0/1)
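Because the result is an exact Rational, decimal strings can be added without the rounding that Float arithmetic introduces. A minimal sketch with illustrative values:

```ruby
exact = '0.1'.to_r + '0.2'.to_r

exact                  # => (3/10)
exact == '0.3'.to_r    # => true

# The same sum in Float arithmetic is not exactly 0.3.
0.1 + 0.2 == 0.3       # => false
```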

NOTE: '0.3'.to_r is not the same as 0.3.to_r. The former is equivalent to '3/10'.to_r; the latter is not, because the Float 0.3 is not exactly 3/10.

  "0.3".to_r == 3/10r # => true
  0.3.to_r == 3/10r # => false
