class String

A String object has an arbitrary sequence of bytes, typically representing text or binary data. A String object may be created using String::new or as literals.

String objects differ from Symbol objects in that Symbol objects are designed to be used as identifiers, instead of text or data.

You can create a String object explicitly with:

You can convert certain objects to Strings with:

Some String methods modify self. Typically, a method whose name ends with ! modifies self and returns self; often, a similarly named method (without the !) returns a new string.

In general, if both bang and non-bang versions of a method exist, the bang method mutates and the non-bang method does not. However, a method without a bang can also mutate, such as String#replace.
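As a minimal sketch of the convention described above, the following contrasts String#upcase, String#upcase!, and String#replace. The variable names are illustrative only.

```ruby
s = +'hello'        # unary plus gives a string that is safe to mutate
copy = s.upcase     # non-bang: returns a new string, s is untouched
unchanged = s.dup   # snapshot of s: still "hello"
s.upcase!           # bang: modifies s in place
shouted = s.dup     # snapshot of s: now "HELLO"
s.replace('bye')    # no bang in the name, yet s is mutated
```

After the last line, s holds "bye" while copy still holds "HELLO", because only the bang method and String#replace touched the receiver.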

Substitution Methods

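These methods perform substitutions:

  • String#sub: One substitution (or none); returns a new string.

  • String#sub!: One substitution (or none); returns self if any changes, nil otherwise.

  • String#gsub: Zero or more substitutions; returns a new string.

  • String#gsub!: Zero or more substitutions; returns self if any changes, nil otherwise.

Each of these methods takes:

  • A first argument, pattern (String or Regexp), that specifies the substring(s) to be replaced.

  • Either a second argument, replacement (String or Hash), or a block, that determines the replacing string.

The examples in this section mostly use the String#sub and String#gsub methods; the principles illustrated apply to all four substitution methods.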

Argument pattern

Argument pattern is commonly a regular expression:

s = 'hello'
s.sub(/[aeiou]/, '*')       # => "h*llo"
s.gsub(/[aeiou]/, '*')      # => "h*ll*"
s.gsub(/[aeiou]/, '')       # => "hll"
s.sub(/ell/, 'al')          # => "halo"
s.gsub(/xyzzy/, '*')        # => "hello"
'THX1138'.gsub(/\d+/, '00') # => "THX00"

When pattern is a string, all its characters are treated as ordinary characters (not as Regexp special characters):

'THX1138'.gsub('\d+', '00') # => "THX1138"

String replacement

If replacement is a string, that string determines the replacing string that is substituted for the matched text.

Each of the examples above uses a simple string as the replacing string.

String replacement may contain back-references to the pattern’s captures:

See Regexp for details.

Note that within the string replacement, a character combination such as $& is treated as ordinary text, not as a special match variable. However, you may refer to some special match variables using these combinations:

See Regexp for details.
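The following sketch illustrates back-references inside a string replacement: numbered (\1) and named (\k<name>) captures, plus the combinations \0, \` and \' that stand in for the special match variables. The variable names and the capture name vowel are assumptions made for this example.

```ruby
# Numbered and named back-references to the pattern's captures.
numbered = 'hello'.gsub(/([aeiou])/, '<\1>')                # "h<e>ll<o>"
named    = 'hello'.gsub(/(?<vowel>[aeiou])/, '{\k<vowel>}') # "h{e}ll{o}"

# Combinations that correspond to special match variables.
whole = 'hello'.sub(/l+/, '[\0]')  # \0 (like \&) is the whole match: "he[ll]o"
pre   = 'hello'.sub(/l+/, '[\`]')  # \` is the text before the match: "he[he]o"
post  = 'hello'.sub(/l+/, "[\\']") # \' is the text after the match:  "he[o]o"
```

Single-quoted literals are used where possible because they leave most backslashes intact; the last line needs a double-quoted literal because \' inside single quotes is an escaped quote.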

Note that \\ is interpreted as an escape, i.e., a single backslash.

Note also that a string literal consumes backslashes. See String Literals for details about string literals.

A back-reference is typically preceded by an additional backslash. For example, if you want to write a back-reference \& in replacement with a double-quoted string literal, you need to write "..\\&..".

If you want to write a non-back-reference string \& in replacement, you need to first escape the backslash to prevent this method from interpreting it as a back-reference, and then you need to escape the backslashes again to prevent a string literal from consuming them: "..\\\\&..".

You may want to use the block form to avoid excessive backslashes.
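The escaping rules above can be sketched as follows; this is an illustration with made-up variable names, not an exhaustive treatment.

```ruby
# Two backslashes in a double-quoted literal produce the back-reference \&.
backref = 'hello'.sub(/ell/, "[\\&]")  # "h[ell]o"

# Four backslashes produce \\&, which the method reads as a literal
# backslash followed by an ordinary ampersand.
literal = 'hello'.sub(/ell/, "\\\\&")  # the four characters h \ & o

# In the block form the return value is used verbatim, so no
# back-reference processing (and no extra escaping) is involved.
block = 'hello'.sub(/ell/) { '\&' }    # same result as literal
```

The block form yields the same text as the four-backslash version, which is why it is often easier to read.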

Hash replacement

If the argument replacement is a hash, and pattern matches one of its keys, the replacing string is the value for that key:

h = {'foo' => 'bar', 'baz' => 'bat'}
'food'.sub('foo', h) # => "bard"

Note that a symbol key does not match:

h = {foo: 'bar', baz: 'bat'}
'food'.sub('foo', h) # => "d"
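A hash replacement is most useful with String#gsub and a regular expression that can match several keys. The sketch below uses a hypothetical lookup table named names.

```ruby
names = { 'cat' => 'feline', 'dog' => 'canine' }

# Each match is looked up in the hash and replaced by its value.
both = 'cat and dog'.gsub(/cat|dog/, names) # "feline and canine"

# A match with no corresponding key is replaced by an empty string.
partial = 'cat and cow'.gsub(/cat|cow/, names) # "feline and "
```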

Block

In the block form, the current match string is passed to the block; the block’s return value becomes the replacing string:

s = '@'
'1234'.gsub(/\d/) { |match| s.succ! } # => "ABCD"

Special match variables such as $1, $2, $`, $&, and $' are set appropriately.
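As a small sketch of the block form, the first call below reads the capture variables $1 and $2 inside the block, and the second computes each replacement from the block parameter. The sample strings are arbitrary.

```ruby
# $1 and $2 hold the two captured words while the block runs.
swapped = 'John Smith'.sub(/(\w+) (\w+)/) { "#{$2}, #{$1}" } # "Smith, John"

# The block parameter is the matched text; the block returns its replacement.
doubled = '1 2 3'.gsub(/\d/) { |digit| (digit.to_i * 2).to_s } # "2 4 6"
```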

Whitespace in Strings

In the class String, whitespace is defined as a contiguous sequence of characters consisting of any mixture of the following:

Whitespace is relevant for the following methods:

What’s Here

First, what’s elsewhere. Class String:

Here, class String provides methods that are useful for:

Creating a String

Freezing/Unfreezing

Querying

Counts

Substrings

Encodings

Other

Comparing

Modifying

Each of these methods modifies self.

Insertion

Substitution

Casing

Encoding

Deletion

Converting to New String

Each of these methods returns a new String based on self, often just a modified copy of self.

Extension

Encoding

Substitution

Casing

Deletion

Duplication

Converting to Non-String

Each of these methods converts the contents of self to a non-String.

Characters, Bytes, and Clusters

Splitting

Matching

Numerics

Strings and Symbols

Iterating

Public Class Methods

Source
# File ext/json/lib/json/add/string.rb, line 11
def self.json_create(object)
  object["raw"].pack("C*")
end

Raw Strings are JSON Objects (the raw bytes are stored in an array for the key “raw”). The Ruby String can be created by this class method.

Source
static VALUErb_str_init(int argc, VALUE *argv, VALUE str){    static ID keyword_ids[2];    VALUE orig, opt, venc, vcapa;    VALUE kwargs[2];    rb_encoding *enc = 0;    int n;    if (!keyword_ids[0]) {        keyword_ids[0] = rb_id_encoding();        CONST_ID(keyword_ids[1], "capacity");    }    n = rb_scan_args(argc, argv, "01:", &orig, &opt);    if (!NIL_P(opt)) {        rb_get_kwargs(opt, keyword_ids, 0, 2, kwargs);        venc = kwargs[0];        vcapa = kwargs[1];        if (!UNDEF_P(venc) && !NIL_P(venc)) {            enc = rb_to_encoding(venc);        }        if (!UNDEF_P(vcapa) && !NIL_P(vcapa)) {            long capa = NUM2LONG(vcapa);            long len = 0;            int termlen = enc ? rb_enc_mbminlen(enc) : 1;            if (capa < STR_BUF_MIN_SIZE) {                capa = STR_BUF_MIN_SIZE;            }            if (n == 1) {                StringValue(orig);                len = RSTRING_LEN(orig);                if (capa < len) {                    capa = len;                }                if (orig == str) n = 0;            }            str_modifiable(str);            if (STR_EMBED_P(str) || FL_TEST(str, STR_SHARED|STR_NOFREE)) {                /* make noembed always */                const size_t size = (size_t)capa + termlen;                const char *const old_ptr = RSTRING_PTR(str);                const size_t osize = RSTRING_LEN(str) + TERM_LEN(str);                char *new_ptr = ALLOC_N(char, size);                if (STR_EMBED_P(str)) RUBY_ASSERT((long)osize <= str_embed_capa(str));                memcpy(new_ptr, old_ptr, osize < size ? 
osize : size);                FL_UNSET_RAW(str, STR_SHARED|STR_NOFREE);                RSTRING(str)->as.heap.ptr = new_ptr;            }            else if (STR_HEAP_SIZE(str) != (size_t)capa + termlen) {                SIZED_REALLOC_N(RSTRING(str)->as.heap.ptr, char,                        (size_t)capa + termlen, STR_HEAP_SIZE(str));            }            STR_SET_LEN(str, len);            TERM_FILL(&RSTRING(str)->as.heap.ptr[len], termlen);            if (n == 1) {                memcpy(RSTRING(str)->as.heap.ptr, RSTRING_PTR(orig), len);                rb_enc_cr_str_exact_copy(str, orig);            }            FL_SET(str, STR_NOEMBED);            RSTRING(str)->as.heap.aux.capa = capa;        }        else if (n == 1) {            rb_str_replace(str, orig);        }        if (enc) {            rb_enc_associate(str, enc);            ENC_CODERANGE_CLEAR(str);        }    }    else if (n == 1) {        rb_str_replace(str, orig);    }    return str;}

Returns a new String object containing the given string.

The options are optional keyword options (see below).

With no argument given and keyword encoding also not given, returns an empty string with the Encoding ASCII-8BIT:

s = String.new # => ""
s.encoding     # => #<Encoding:ASCII-8BIT>

With argument string given and keyword option encoding not given, returns a new string with the same encoding as string:

s0 = 'foo'.encode(Encoding::UTF_16)
s1 = String.new(s0)
s1.encoding # => #<Encoding:UTF-16 (dummy)>

(Unlike String.new, a string literal like '' or a here document literal always has script encoding.)

With keyword option encoding given, returns a string with the specified encoding; the encoding may be an Encoding object, an encoding name, or an encoding name alias:

String.new(encoding: Encoding::US_ASCII).encoding        # => #<Encoding:US-ASCII>
String.new('', encoding: Encoding::US_ASCII).encoding    # => #<Encoding:US-ASCII>
String.new('foo', encoding: Encoding::US_ASCII).encoding # => #<Encoding:US-ASCII>
String.new('foo', encoding: 'US-ASCII').encoding         # => #<Encoding:US-ASCII>
String.new('foo', encoding: 'ASCII').encoding            # => #<Encoding:US-ASCII>

The given encoding need not be valid for the string’s content, and its validity is not checked:

s = String.new('こんにちは', encoding: 'ascii')
s.valid_encoding? # => false

But the given encoding itself is checked:

String.new('foo', encoding: 'bar') # Raises ArgumentError.

With keyword option capacity given, the given value is advisory only, and may or may not set the size of the internal buffer, which may in turn affect performance:

String.new('foo', capacity: 1)    # Buffer size is at least 4 (includes terminal null byte).
String.new('foo', capacity: 4096) # Buffer size is at least 4;
                                  # may be equal to, greater than, or less than 4096.
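One plausible use of the capacity and encoding options is to create a buffer that is then filled incrementally. The sketch below assumes an arbitrary capacity of 64; whether that hint changes any allocation is up to the implementation.

```ruby
# Create an empty UTF-8 buffer with an advisory capacity (64 is arbitrary).
buffer = String.new(capacity: 64, encoding: Encoding::UTF_8)

# Append pieces in place; each << returns the buffer, so calls can be chained.
%w[alpha beta gamma].each { |word| buffer << word << ',' }

# Drop the trailing separator.
buffer.chomp!(',')
```

The capacity only hints at the expected size; the resulting content and encoding are the same without it.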
Source
static VALUE
rb_str_s_try_convert(VALUE dummy, VALUE str)
{
    return rb_check_string_type(str);
}

Attempts to convert the given object to a string.

If object is already a string, returns object, unmodified.

Otherwise if object responds to :to_str, calls object.to_str and returns the result.

Returns nil if object does not respond to :to_str.

Raises an exception unless object.to_str returns a string.
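The four cases above can be sketched with two hypothetical classes, Label and BadLabel, which exist only for this illustration.

```ruby
# Hypothetical class whose to_str returns a string.
class Label
  def initialize(text)
    @text = text
  end

  def to_str
    @text
  end
end

# Hypothetical class whose to_str does not return a string.
class BadLabel
  def to_str
    42
  end
end

original  = 'foo'
same      = String.try_convert(original)         # the very same object
converted = String.try_convert(Label.new('bar')) # "bar", via Label#to_str
missing   = String.try_convert(42)               # nil: Integer has no to_str

# A to_str that returns a non-string makes try_convert raise TypeError.
error = begin
  String.try_convert(BadLabel.new)
rescue TypeError => e
  e
end
```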

Public Instance Methods

Source
static VALUE
rb_str_format_m(VALUE str, VALUE arg)
{
    VALUE tmp = rb_check_array_type(arg);
    if (!NIL_P(tmp)) {
        return rb_str_format(RARRAY_LENINT(tmp), RARRAY_CONST_PTR(tmp), str);
    }
    return rb_str_format(1, &arg, str);
}

Returns the result of formatting object into the format specifications contained in self (see Format Specifications):

'%05d' % 123 # => "00123"

If self contains multiple format specifications, object must be an array or hash containing the objects to be formatted:

'%-5s: %016x' % ['ID', self.object_id]                  # => "ID   : 00002b054ec93168"
'foo = %{foo}' % {foo: 'bar'}                           # => "foo = bar"
'foo = %{foo}, baz = %{baz}' % {foo: 'bar', baz: 'bat'} # => "foo = bar, baz = bat"

Related: see Converting to New String.

Source
VALUErb_str_times(VALUE str, VALUE times){    VALUE str2;    long n, len;    char *ptr2;    int termlen;    if (times == INT2FIX(1)) {        return str_duplicate(rb_cString, str);    }    if (times == INT2FIX(0)) {        str2 = str_alloc_embed(rb_cString, 0);        rb_enc_copy(str2, str);        return str2;    }    len = NUM2LONG(times);    if (len < 0) {        rb_raise(rb_eArgError, "negative argument");    }    if (RSTRING_LEN(str) == 1 && RSTRING_PTR(str)[0] == 0) {        if (STR_EMBEDDABLE_P(len, 1)) {            str2 = str_alloc_embed(rb_cString, len + 1);            memset(RSTRING_PTR(str2), 0, len + 1);        }        else {            str2 = str_alloc_heap(rb_cString);            RSTRING(str2)->as.heap.aux.capa = len;            RSTRING(str2)->as.heap.ptr = ZALLOC_N(char, (size_t)len + 1);        }        STR_SET_LEN(str2, len);        rb_enc_copy(str2, str);        return str2;    }    if (len && LONG_MAX/len <  RSTRING_LEN(str)) {        rb_raise(rb_eArgError, "argument too big");    }    len *= RSTRING_LEN(str);    termlen = TERM_LEN(str);    str2 = str_enc_new(rb_cString, 0, len, STR_ENC_GET(str));    ptr2 = RSTRING_PTR(str2);    if (len) {        n = RSTRING_LEN(str);        memcpy(ptr2, RSTRING_PTR(str), n);        while (n <= len/2) {            memcpy(ptr2 + n, ptr2, n);            n *= 2;        }        memcpy(ptr2 + n, ptr2, len-n);    }    STR_SET_LEN(str2, len);    TERM_FILL(&ptr2[len], termlen);    rb_enc_cr_str_copy_for_substr(str2, str);    return str2;}

Returns a new string containing n copies of self:

'Ho!' * 3 # => "Ho!Ho!Ho!"
'No!' * 0 # => ""

Related: see Converting to New String.

Source
VALUErb_str_plus(VALUE str1, VALUE str2){    VALUE str3;    rb_encoding *enc;    char *ptr1, *ptr2, *ptr3;    long len1, len2;    int termlen;    StringValue(str2);    enc = rb_enc_check_str(str1, str2);    RSTRING_GETMEM(str1, ptr1, len1);    RSTRING_GETMEM(str2, ptr2, len2);    termlen = rb_enc_mbminlen(enc);    if (len1 > LONG_MAX - len2) {        rb_raise(rb_eArgError, "string size too big");    }    str3 = str_enc_new(rb_cString, 0, len1+len2, enc);    ptr3 = RSTRING_PTR(str3);    memcpy(ptr3, ptr1, len1);    memcpy(ptr3+len1, ptr2, len2);    TERM_FILL(&ptr3[len1+len2], termlen);    ENCODING_CODERANGE_SET(str3, rb_enc_to_index(enc),                           ENC_CODERANGE_AND(ENC_CODERANGE(str1), ENC_CODERANGE(str2)));    RB_GC_GUARD(str1);    RB_GC_GUARD(str2);    return str3;}

Returns a new string containing other_string concatenated to self:

'Hello from ' + self.to_s # => "Hello from main"

Related: see Converting to New String.

Source
static VALUE
str_uplus(VALUE str)
{
    if (OBJ_FROZEN(str) || CHILLED_STRING_P(str)) {
        return rb_str_dup(str);
    }
    else {
        return str;
    }
}

Returns self if self is not frozen and can be mutated without warning issuance.

Otherwise returns self.dup, which is not frozen.

Related: see Freezing/Unfreezing.
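A short sketch of both outcomes, using illustrative variable names: a frozen receiver yields an unfrozen copy, while an unfrozen receiver is returned as is.

```ruby
frozen  = 'foo'.freeze
thawed  = +frozen           # frozen receiver: an unfrozen copy is returned
mutable = String.new('bar')
same    = +mutable          # unfrozen receiver: the receiver itself is returned

thawed << '!'               # safe: only the copy changes, frozen stays "foo"
```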

Source
static VALUE
str_uminus(VALUE str)
{
    if (!BARE_STRING_P(str) && !rb_obj_frozen_p(str)) {
        str = rb_str_dup(str);
    }
    return rb_fstring(str);
}

Returns a frozen string equal to self.

The returned string is self if and only if all of the following are true:

  • self is already frozen.

  • self is an instance of String (rather than of a subclass of String)

  • self has no instance variables set on it.

Otherwise, the returned string is a frozen copy of self.

Returning self, when possible, saves duplicating self; see Data deduplication.

It may also save duplicating other, already-existing, strings:

s0 = 'foo'
s1 = 'foo'
s0.object_id == s1.object_id       # => false
(-s0).object_id == (-s1).object_id # => true

Note that method -@ is convenient for defining a constant:

FileName = -'config/database.yml'

While its alias dedup is better suited for chaining:

'foo'.dedup.gsub!('o')

Related: see Freezing/Unfreezing.

Also aliased as: dedup
Source
VALUErb_str_concat(VALUE str1, VALUE str2){    unsigned int code;    rb_encoding *enc = STR_ENC_GET(str1);    int encidx;    if (RB_INTEGER_TYPE_P(str2)) {        if (rb_num_to_uint(str2, &code) == 0) {        }        else if (FIXNUM_P(str2)) {            rb_raise(rb_eRangeError, "%ld out of char range", FIX2LONG(str2));        }        else {            rb_raise(rb_eRangeError, "bignum out of char range");        }    }    else {        return rb_str_append(str1, str2);    }    encidx = rb_ascii8bit_appendable_encoding_index(enc, code);    if (encidx >= 0) {        rb_str_buf_cat_byte(str1, (unsigned char)code);    }    else {        long pos = RSTRING_LEN(str1);        int cr = ENC_CODERANGE(str1);        int len;        char *buf;        switch (len = rb_enc_codelen(code, enc)) {          case ONIGERR_INVALID_CODE_POINT_VALUE:            rb_raise(rb_eRangeError, "invalid codepoint 0x%X in %s", code, rb_enc_name(enc));            break;          case ONIGERR_TOO_BIG_WIDE_CHAR_VALUE:          case 0:            rb_raise(rb_eRangeError, "%u out of char range", code);            break;        }        buf = ALLOCA_N(char, len + 1);        rb_enc_mbcput(code, buf, enc);        if (rb_enc_precise_mbclen(buf, buf + len + 1, enc) != len) {            rb_raise(rb_eRangeError, "invalid codepoint 0x%X in %s", code, rb_enc_name(enc));        }        rb_str_resize(str1, pos+len);        memcpy(RSTRING_PTR(str1) + pos, buf, len);        if (cr == ENC_CODERANGE_7BIT && code > 127) {            cr = ENC_CODERANGE_VALID;        }        else if (cr == ENC_CODERANGE_BROKEN) {            cr = ENC_CODERANGE_UNKNOWN;        }        ENC_CODERANGE_SET(str1, cr);    }    return str1;}

Appends a string representation of object to self; returns self.

If object is a string, appends it to self:

s = 'foo'
s << 'bar' # => "foobar"
s          # => "foobar"

If object is an integer, its value is considered a codepoint; converts the value to a character before concatenating:

s = 'foo'
s << 33 # => "foo!"

Additionally, if the codepoint is in range 0..0xff and the encoding of self is Encoding::US_ASCII, changes the encoding to Encoding::ASCII_8BIT:

s = 'foo'.encode(Encoding::US_ASCII)
s.encoding # => #<Encoding:US-ASCII>
s << 0xff  # => "foo\xFF"
s.encoding # => #<Encoding:BINARY (ASCII-8BIT)>

Raises RangeError if that codepoint is not representable in the encoding of self:

s = 'foo'
s.encoding      # => #<Encoding:UTF-8>
s << 0x00110000 # 1114112 out of char range (RangeError)
s = 'foo'.encode(Encoding::EUC_JP)
s << 0x00800080 # invalid codepoint 0x800080 in EUC-JP (RangeError)

Related: see Modifying.

Source
static VALUE
rb_str_cmp_m(VALUE str1, VALUE str2)
{
    int result;
    VALUE s = rb_check_string_type(str2);
    if (NIL_P(s)) {
        return rb_invcmp(str1, str2);
    }
    result = rb_str_cmp(str1, s);
    return INT2FIX(result);
}

Compares self and other_string, returning:

  • -1 if other_string is larger.

  • 0 if the two are equal.

  • 1 if other_string is smaller.

  • nil if the two are incomparable.

Examples:

'foo' <=> 'foo'  # => 0
'foo' <=> 'food' # => -1
'food' <=> 'foo' # => 1
'FOO' <=> 'foo'  # => -1
'foo' <=> 'FOO'  # => 1
'foo' <=> 1      # => nil

Related: see Comparing.
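Because Array#sort relies on <=>, the case-sensitive ordering shown above carries over to sorting. The sketch below, with an arbitrary word list, contrasts the default order with a case-insensitive one obtained through sort_by.

```ruby
words = %w[banana Cherry apple]

# Default sort uses String#<=>, which places uppercase letters first.
by_bytes = words.sort                 # ["Cherry", "apple", "banana"]

# Sorting by the downcased form ignores case.
by_letter = words.sort_by(&:downcase) # ["apple", "banana", "Cherry"]

# A lowercase letter compares as larger than an uppercase one.
mixed = ('a' <=> 'B')                 # 1
```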

Source
VALUE
rb_str_equal(VALUE str1, VALUE str2)
{
    if (str1 == str2) return Qtrue;
    if (!RB_TYPE_P(str2, T_STRING)) {
        if (!rb_respond_to(str2, idTo_str)) {
            return Qfalse;
        }
        return rb_equal(str2, str1);
    }
    return rb_str_eql_internal(str1, str2);
}

Returns whether object is equal to self.

When object is a string, returns whether object has the same length and content as self:

s = 'foo'
s == 'foo'  # => true
s == 'food' # => false
s == 'FOO'  # => false

Returns false if the two strings’ encodings are not compatible:

"\u{e4 f6 fc}".encode(Encoding::ISO_8859_1) == ("\u{c4 d6 dc}") # => false

When object is not a string:

  • If object responds to method to_str, object == self is called and its return value is returned.

  • If object does not respond to to_str, false is returned.

Related: see Comparing.
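The delegation to object == self can be sketched with a hypothetical class Name that defines both to_str and ==; the class is an assumption made for this example only.

```ruby
# Hypothetical string-like class, used only for illustration.
class Name
  def initialize(value)
    @value = value
  end

  def to_str
    @value
  end

  def ==(other)
    other.respond_to?(:to_str) && @value == other.to_str
  end
end

# Name responds to to_str, so String#== hands the comparison to Name#==.
delegated_true  = ('foo' == Name.new('foo')) # true
delegated_false = ('foo' == Name.new('bar')) # false

# Symbol does not respond to to_str, so the result is false.
no_to_str = ('foo' == :foo)                  # false
```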

Also aliased as: ===
Alias for: ==
Source
static VALUE
rb_str_match(VALUE x, VALUE y)
{
    switch (OBJ_BUILTIN_TYPE(y)) {
      case T_STRING:
        rb_raise(rb_eTypeError, "type mismatch: String given");
      case T_REGEXP:
        return rb_reg_match(y, x);
      default:
        return rb_funcall(y, idEqTilde, 1, x);
    }
}

When object is a Regexp, returns the index of the first substring in self matched by object, or nil if no match is found; updates Regexp-related global variables:

'foo' =~ /f/ # => 0
$~           # => #<MatchData "f">
'foo' =~ /o/ # => 1
$~           # => #<MatchData "o">
'foo' =~ /x/ # => nil
$~           # => nil

Note that string =~ regexp is different from regexp =~ string (see Regexp#=~):

number = nil
'no. 9' =~ /(?<number>\d+)/ # => 4
number                      # => nil # Not assigned.
/(?<number>\d+)/ =~ 'no. 9' # => 4
number                      # => "9" # Assigned.

If object is not a Regexp, returns the value returned by object =~ self.

Related: see Querying.

Source
static VALUE
rb_str_aref_m(int argc, VALUE *argv, VALUE str)
{
    if (argc == 2) {
        if (RB_TYPE_P(argv[0], T_REGEXP)) {
            return rb_str_subpat(str, argv[0], argv[1]);
        }
        else {
            return rb_str_substr_two_fixnums(str, argv[0], argv[1], TRUE);
        }
    }
    rb_check_arity(argc, 1, 2);
    return rb_str_aref(str, argv[0]);
}

Returns the substring of self specified by the arguments.

Form self[index]

With non-negative integer argument index given, returns the 1-character substring found in self at character offset index:

'hello'[0]      # => "h"
'hello'[4]      # => "o"
'hello'[5]      # => nil
'тест'[2]       # => "с"
'こんにちは'[4] # => "は"

With negative integer argument index given, counts backward from the end of self:

'hello'[-1] # => "o"
'hello'[-5] # => "h"
'hello'[-6] # => nil

Form self[start, length]

With integer arguments start and length given, returns a substring of size length characters (as available) beginning at character offset specified by start.

If argument start is non-negative, the offset is start:

'hello'[0, 1]  # => "h"
'hello'[0, 5]  # => "hello"
'hello'[0, 6]  # => "hello"
'hello'[2, 3]  # => "llo"
'hello'[2, 0]  # => ""
'hello'[2, -1] # => nil

If argument start is negative, counts backward from the end of self:

'hello'[-1, 1] # => "o"
'hello'[-5, 5] # => "hello"
'hello'[-1, 0] # => ""
'hello'[-6, 5] # => nil

Special case: if start equals the length of self, returns a new empty string:

'hello'[5, 3] # => ""

Form self[range]

With Range argument range given, forms substring self[range.start, range.size]:

'hello'[0..2]  # => "hel"
'hello'[0, 3]  # => "hel"
'hello'[0...2] # => "he"
'hello'[0, 2]  # => "he"
'hello'[0, 0]  # => ""
'hello'[0...0] # => ""
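The range form also accepts negative endpoints and open-ended ranges. The sketch below assumes a Ruby version that supports endless and beginless range literals.

```ruby
s = 'hello'

tail      = s[1..]   # endless range: from offset 1 to the end, "ello"
last3     = s[-3..]  # negative start counts from the end, "llo"
head      = s[..1]   # beginless range: from the start through offset 1, "he"
middle    = s[1..-2] # negative end counts from the end, "ell"
at_end    = s[5..]   # start equal to the length gives an empty string
past_end  = s[6..]   # start beyond the length gives nil
```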

Form self[regexp, capture = 0]

With Regexp argument regexp given and capture as zero, searches for a matching substring in self; updates Regexp-related global variables:

'hello'[/ell/]    # => "ell"
'hello'[/l+/]     # => "ll"
'hello'[//]       # => ""
'hello'[/nosuch/] # => nil

With capture as a positive integer n, returns the nth matched group:

'hello'[/(h)(e)(l+)(o)/]    # => "hello"
'hello'[/(h)(e)(l+)(o)/, 1] # => "h"
$1                          # => "h"
'hello'[/(h)(e)(l+)(o)/, 2] # => "e"
$2                          # => "e"
'hello'[/(h)(e)(l+)(o)/, 3] # => "ll"
'hello'[/(h)(e)(l+)(o)/, 4] # => "o"
'hello'[/(h)(e)(l+)(o)/, 5] # => nil

Form self[substring]

With string argument substring given, returns the matching substring of self, if found:

'hello'['ell']         # => "ell"
'hello'['']            # => ""
'hello'['nosuch']      # => nil
'тест'['ес']           # => "ес"
'こんにちは'['んにち'] # => "んにち"

Related: see Converting to New String.

Also aliased as: slice
Source
static VALUE
rb_str_aset_m(int argc, VALUE *argv, VALUE str)
{
    if (argc == 3) {
        if (RB_TYPE_P(argv[0], T_REGEXP)) {
            rb_str_subpat_set(str, argv[0], argv[1], argv[2]);
        }
        else {
            rb_str_update(str, NUM2LONG(argv[0]), NUM2LONG(argv[1]), argv[2]);
        }
        return argv[2];
    }
    rb_check_arity(argc, 2, 3);
    return rb_str_aset(str, argv[0], argv[1]);
}

Returns self with all, a substring, or none of its contents replaced; returns the argument other_string.

Form self[index] = other_string

With non-negative integer argument index given, searches for the 1-character substring found in self at character offset index:

s = 'hello'
s[0] = 'foo' # => "foo"
s            # => "fooello"
s = 'hello'
s[4] = 'foo' # => "foo"
s            # => "hellfoo"
s = 'hello'
s[5] = 'foo' # => "foo"
s            # => "hellofoo"
s = 'hello'
s[6] = 'foo' # Raises IndexError: index 6 out of string.

With negative integer argument index given, counts backward from the end of self:

s = 'hello'
s[-1] = 'foo' # => "foo"
s             # => "hellfoo"
s = 'hello'
s[-5] = 'foo' # => "foo"
s             # => "fooello"
s = 'hello'
s[-6] = 'foo' # Raises IndexError: index -6 out of string.

Form self[start, length] = other_string

With integer arguments start and length given, searches for a substring of size length characters (as available) beginning at character offset specified by start.

If argument start is non-negative, the offset is start:

s = 'hello'
s[0, 1] = 'foo'  # => "foo"
s                # => "fooello"
s = 'hello'
s[0, 5] = 'foo'  # => "foo"
s                # => "foo"
s = 'hello'
s[0, 9] = 'foo'  # => "foo"
s                # => "foo"
s = 'hello'
s[2, 0] = 'foo'  # => "foo"
s                # => "hefoollo"
s = 'hello'
s[2, -1] = 'foo' # Raises IndexError: negative length -1.

If argument start is negative, counts backward from the end of self:

s = 'hello'
s[-1, 1] = 'foo' # => "foo"
s                # => "hellfoo"
s = 'hello'
s[-1, 9] = 'foo' # => "foo"
s                # => "hellfoo"
s = 'hello'
s[-5, 2] = 'foo' # => "foo"
s                # => "foollo"
s = 'hello'
s[-3, 0] = 'foo' # => "foo"
s                # => "hefoollo"
s = 'hello'
s[-6, 2] = 'foo' # Raises IndexError: index -6 out of string.

Special case: if start equals the length of self, the argument is appended to self:

s = 'hello'
s[5, 3] = 'foo' # => "foo"
s               # => "hellofoo"

Form self[range] = other_string

With Range argument range given, equivalent to self[range.start, range.size] = other_string:

s0 = 'hello'
s1 = 'hello'
s0[0..2] = 'foo' # => "foo"
s1[0, 3] = 'foo' # => "foo"
s0               # => "foolo"
s1               # => "foolo"
s = 'hello'
s[0...2] = 'foo' # => "foo"
s                # => "foollo"
s = 'hello'
s[0...0] = 'foo' # => "foo"
s                # => "foohello"
s = 'hello'
s[9..10] = 'foo' # Raises RangeError: 9..10 out of range

Form self[regexp, capture = 0] = other_string

With Regexp argument regexp given and capture as zero, searches for a matching substring in self; updates Regexp-related global variables:

s = 'hello'
s[/l/] = 'L'       # => "L"
[$`, $&, $']       # => ["he", "l", "lo"]
s[/eLlo/] = 'owdy' # => "owdy"
[$`, $&, $']       # => ["h", "eLlo", ""]
s[/eLlo/] = 'owdy' # Raises IndexError: regexp not matched.
[$`, $&, $']       # => [nil, nil, nil]

With capture as a positive integer n, searches for the nth matched group:

s = 'hello'
s[/(h)(e)(l+)(o)/] = 'foo'    # => "foo"
[$`, $&, $']                  # => ["", "hello", ""]
s = 'hello'
s[/(h)(e)(l+)(o)/, 1] = 'foo' # => "foo"
s                             # => "fooello"
[$`, $&, $']                  # => ["", "hello", ""]
s = 'hello'
s[/(h)(e)(l+)(o)/, 2] = 'foo' # => "foo"
s                             # => "hfoollo"
[$`, $&, $']                  # => ["", "hello", ""]
s = 'hello'
s[/(h)(e)(l+)(o)/, 4] = 'foo' # => "foo"
s                             # => "hellfoo"
[$`, $&, $']                  # => ["", "hello", ""]
s = 'hello'                   # => "hello"
s[/(h)(e)(l+)(o)/, 5] = 'foo' # Raises IndexError: index 5 out of regexp.
s = 'hello'
s[/nosuch/] = 'foo'           # Raises IndexError: regexp not matched.

Form self[substring] = other_string

With string argument substring given:

s = 'hello'
s['l'] = 'foo'      # => "foo"
s                   # => "hefoolo"
s = 'hello'
s['ll'] = 'foo'     # => "foo"
s                   # => "hefooo"
s = 'тест'
s['ес'] = 'foo'     # => "foo"
s                   # => "тfooт"
s = 'こんにちは'
s['んにち'] = 'foo' # => "foo"
s                   # => "こfooは"
s['nosuch'] = 'foo' # Raises IndexError: string not matched.

Related: see Modifying.

Source
VALUErb_str_append_as_bytes(int argc, VALUE *argv, VALUE str){    long needed_capacity = 0;    volatile VALUE t0;    enum ruby_value_type *types = ALLOCV_N(enum ruby_value_type, t0, argc);    for (int index = 0; index < argc; index++) {        VALUE obj = argv[index];        enum ruby_value_type type = types[index] = rb_type(obj);        switch (type) {          case T_FIXNUM:          case T_BIGNUM:            needed_capacity++;            break;          case T_STRING:            needed_capacity += RSTRING_LEN(obj);            break;          default:            rb_raise(                rb_eTypeError,                "wrong argument type %"PRIsVALUE" (expected String or Integer)",                rb_obj_class(obj)            );            break;        }    }    str_ensure_available_capa(str, needed_capacity);    char *sptr = RSTRING_END(str);    for (int index = 0; index < argc; index++) {        VALUE obj = argv[index];        enum ruby_value_type type = types[index];        switch (type) {          case T_FIXNUM:          case T_BIGNUM: {            argv[index] = obj = rb_int_and(obj, INT2FIX(0xff));            char byte = (char)(NUM2INT(obj) & 0xFF);            *sptr = byte;            sptr++;            break;          }          case T_STRING: {            const char *ptr;            long len;            RSTRING_GETMEM(obj, ptr, len);            memcpy(sptr, ptr, len);            sptr += len;            break;          }          default:            rb_bug("append_as_bytes arguments should have been validated");        }    }    STR_SET_LEN(str, RSTRING_LEN(str) + needed_capacity);    TERM_FILL(sptr, TERM_LEN(str)); /* sentinel */    int cr = ENC_CODERANGE(str);    switch (cr) {      case ENC_CODERANGE_7BIT: {        for (int index = 0; index < argc; index++) {            VALUE obj = argv[index];            enum ruby_value_type type = types[index];            switch (type) {              case T_FIXNUM:              case T_BIGNUM: {                if 
(!ISASCII(NUM2INT(obj))) {                    goto clear_cr;                }                break;              }              case T_STRING: {                if (ENC_CODERANGE(obj) != ENC_CODERANGE_7BIT) {                    goto clear_cr;                }                break;              }              default:                rb_bug("append_as_bytes arguments should have been validated");            }        }        break;      }      case ENC_CODERANGE_VALID:        if (ENCODING_GET_INLINED(str) == ENCINDEX_ASCII_8BIT) {            goto keep_cr;        }        else {            goto clear_cr;        }        break;      default:        goto clear_cr;        break;    }    RB_GC_GUARD(t0);  clear_cr:    // If no fast path was hit, we clear the coderange.    // append_as_bytes is predominently meant to be used in    // buffering situation, hence it's likely the coderange    // will never be scanned, so it's not worth spending time    // precomputing the coderange except for simple and common    // situations.    ENC_CODERANGE_CLEAR(str);  keep_cr:    return str;}

Concatenates each object in objects into self; returns self; performs no encoding validation or conversion:

s = 'foo'
s.append_as_bytes(" \xE2\x82") # => "foo \xE2\x82"
s.valid_encoding?              # => false
s.append_as_bytes("\xAC 12")
s.valid_encoding?              # => true

When a given object is an integer, the value is considered an 8-bit byte; if the integer occupies more than one byte (i.e., is greater than 255), appends only the low-order byte (similar to String#setbyte):

s = ""
s.append_as_bytes(0, 257) # => "\u0000\u0001"
s.bytesize                # => 2

Related: see Modifying.

Source
static VALUE
rb_str_is_ascii_only_p(VALUE str)
{
    int cr = rb_enc_str_coderange(str);
    return RBOOL(cr == ENC_CODERANGE_7BIT);
}

Returns whether self contains only ASCII characters:

'abc'.ascii_only?         # => true
"abc\u{6666}".ascii_only? # => false

Related: see Querying.

Source
static VALUErb_str_b(VALUE str){    VALUE str2;    if (STR_EMBED_P(str)) {        str2 = str_alloc_embed(rb_cString, RSTRING_LEN(str) + TERM_LEN(str));    }    else {        str2 = str_alloc_heap(rb_cString);    }    str_replace_shared_without_enc(str2, str);    if (rb_enc_asciicompat(STR_ENC_GET(str))) {        // BINARY strings can never be broken; they're either 7-bit ASCII or VALID.        // If we know the receiver's code range then we know the result's code range.        int cr = ENC_CODERANGE(str);        switch (cr) {          case ENC_CODERANGE_7BIT:            ENC_CODERANGE_SET(str2, ENC_CODERANGE_7BIT);            break;          case ENC_CODERANGE_BROKEN:          case ENC_CODERANGE_VALID:            ENC_CODERANGE_SET(str2, ENC_CODERANGE_VALID);            break;          default:            ENC_CODERANGE_CLEAR(str2);            break;        }    }    return str2;}

Returns a copy of self that has ASCII-8BIT encoding; the underlying bytes are not modified:

s = "\x99"
s.encoding   # => #<Encoding:UTF-8>
t = s.b      # => "\x99"
t.encoding   # => #<Encoding:ASCII-8BIT>
s = "\u4095" # => "䂕"
s.encoding   # => #<Encoding:UTF-8>
s.bytes      # => [228, 130, 149]
t = s.b      # => "\xE4\x82\x95"
t.encoding   # => #<Encoding:ASCII-8BIT>
t.bytes      # => [228, 130, 149]

Related: see Converting to New String.

Source
static VALUErb_str_byteindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    long pos;    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        long slen = RSTRING_LEN(str);        pos = NUM2LONG(initpos);        if (pos < 0 ? (pos += slen) < 0 : pos > slen) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }    }    else {        pos = 0;    }    str_ensure_byte_pos(str, pos);    if (RB_TYPE_P(sub, T_REGEXP)) {        if (rb_reg_search(sub, str, pos, 0) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = BEG(0);            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_byteindex(str, sub, pos);        if (pos >= 0) return LONG2NUM(pos);    }    return Qnil;}

Returns the 0-based integer index of a substring of self specified by object (a string or Regexp) and offset, or nil if there is no such substring; the returned index is a count of bytes (not characters).

When object is a string, returns the index of the first found substring equal to object:

s = 'foo' # => "foo"
s.size # => 3 # Three 1-byte characters.
s.bytesize # => 3 # Three bytes.
s.byteindex('f') # => 0
s.byteindex('o') # => 1
s.byteindex('oo') # => 1
s.byteindex('ooo') # => nil

When object is a Regexp, returns the index of the first found substring matching object; updates Regexp-related global variables:

s = 'foo'
s.byteindex(/f/) # => 0
$~ # => #<MatchData "f">
s.byteindex(/o/) # => 1
s.byteindex(/oo/) # => 1
s.byteindex(/ooo/) # => nil
$~ # => nil

Integer argument offset, if given, specifies the 0-based index of the byte where searching is to begin.

When offset is non-negative, searching begins at byte position offset:

s = 'foo'
s.byteindex('o', 1) # => 1
s.byteindex('o', 2) # => 2
s.byteindex('o', 3) # => nil

When offset is negative, counts backward from the end of self:

s = 'foo'
s.byteindex('o', -1) # => 2
s.byteindex('o', -2) # => 1
s.byteindex('o', -3) # => 1
s.byteindex('o', -4) # => nil

Raises IndexError if the byte at offset is not the first byte of a character:

s = "\uFFFF\uFFFF" # => "\uFFFF\uFFFF"
s.size # => 2 # Two 3-byte characters.
s.bytesize # => 6 # Six bytes.
s.byteindex("\uFFFF") # => 0
s.byteindex("\uFFFF", 1) # Raises IndexError
s.byteindex("\uFFFF", 2) # Raises IndexError
s.byteindex("\uFFFF", 3) # => 3
s.byteindex("\uFFFF", 4) # Raises IndexError
s.byteindex("\uFFFF", 5) # Raises IndexError
s.byteindex("\uFFFF", 6) # => nil
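As a rough illustration of how a byte index relates to a character index, the sketch below compares String#index with String#byteindex on a multi-byte string. It assumes Ruby 3.2 or later (where byteindex is available) and UTF-8 source encoding; the variable names are only illustrative.

```ruby
# Assumes Ruby 3.2+ and UTF-8 source encoding.
s = 'こんにちは'                 # Five 3-byte characters.

char_index = s.index('に')      # Position counted in characters.
byte_index = s.byteindex('に')  # Position counted in bytes.

# The bytes before byte_index are exactly the characters before char_index.
prefix = s.byteslice(0, byte_index)

puts char_index   # 2
puts byte_index   # 6
puts prefix       # こん
```

A byte index obtained this way always falls on a character boundary, so it can be passed back to byteindex as an offset without raising IndexError.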

Related: see Querying.

Source
static VALUErb_str_byterindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    long pos, len = RSTRING_LEN(str);    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        pos = NUM2LONG(initpos);        if (pos < 0 && (pos += len) < 0) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }        if (pos > len) pos = len;    }    else {        pos = len;    }    str_ensure_byte_pos(str, pos);    if (RB_TYPE_P(sub, T_REGEXP)) {        if (rb_reg_search(sub, str, pos, 1) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = BEG(0);            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_byterindex(str, sub, pos);        if (pos >= 0) return LONG2NUM(pos);    }    return Qnil;}

Returns the 0-based integer index of a substring of self that is the last match for the given object (a string or Regexp) and offset, or nil if there is no such substring; the returned index is a count of bytes (not characters).

When object is a string, returns the index of the last found substring equal to object:

s = 'foo' # => "foo"
s.size # => 3 # Three 1-byte characters.
s.bytesize # => 3 # Three bytes.
s.byterindex('f') # => 0
s.byterindex('o') # => 2
s.byterindex('oo') # => 1
s.byterindex('ooo') # => nil

When object is a Regexp, returns the index of the last found substring matching object; updates Regexp-related global variables:

s = 'foo'
s.byterindex(/f/) # => 0
$~ # => #<MatchData "f">
s.byterindex(/o/) # => 2
s.byterindex(/oo/) # => 1
s.byterindex(/ooo/) # => nil
$~ # => nil

The last match is the match that starts at the rightmost possible position, not the last of the longest matches:

s = 'foo'
s.byterindex(/o+/) # => 2
$~ # => #<MatchData "o">

To get the last longest match, use a negative lookbehind:

s = 'foo'
s.byterindex(/(?<!o)o+/) # => 1
$~ # => #<MatchData "oo">

Or use method byteindex with negative lookahead:

s = 'foo'
s.byteindex(/o+(?!.*o)/) # => 1
$~ # => #<MatchData "oo">

Integer argument offset, if given, specifies the 0-based index of the byte where searching is to end.

When offset is non-negative, searching ends at byte position offset:

s = 'foo'
s.byterindex('o', 0) # => nil
s.byterindex('o', 1) # => 1
s.byterindex('o', 2) # => 2
s.byterindex('o', 3) # => 2

When offset is negative, counts backward from the end of self:

s = 'foo'
s.byterindex('o', -1) # => 2
s.byterindex('o', -2) # => 1
s.byterindex('o', -3) # => nil

Raises IndexError if the byte at offset is not the first byte of a character:

s = "\uFFFF\uFFFF" # => "\uFFFF\uFFFF"
s.size # => 2 # Two 3-byte characters.
s.bytesize # => 6 # Six bytes.
s.byterindex("\uFFFF") # => 3
s.byterindex("\uFFFF", 1) # Raises IndexError
s.byterindex("\uFFFF", 2) # Raises IndexError
s.byterindex("\uFFFF", 3) # => 3
s.byterindex("\uFFFF", 4) # Raises IndexError
s.byterindex("\uFFFF", 5) # Raises IndexError
s.byterindex("\uFFFF", 6) # => nil

Related: see Querying.

Source
static VALUErb_str_bytes(VALUE str){    VALUE ary = WANTARRAY("bytes", RSTRING_LEN(str));    return rb_str_enumerate_bytes(str, ary);}

Returns an array of the bytes in self:

'hello'.bytes # => [104, 101, 108, 108, 111]
'тест'.bytes # => [209, 130, 208, 181, 209, 129, 209, 130]
'こんにちは'.bytes # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]

Related: see Converting to Non-String.

Source
VALUErb_str_bytesize(VALUE str){    return LONG2NUM(RSTRING_LEN(str));}

Returns the count of bytes in self.

Note that the byte count may be different from the character count (returned by size):

s = 'foo'
s.bytesize # => 3
s.size # => 3
s = 'тест'
s.bytesize # => 8
s.size # => 4
s = 'こんにちは'
s.bytesize # => 15
s.size # => 5
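One place where the difference matters is a storage limit expressed in bytes. The sketch below is only an illustration: MAX_BYTES is a hypothetical limit, not something defined by Ruby or by this class.

```ruby
# MAX_BYTES is a hypothetical limit chosen for illustration.
MAX_BYTES = 10

['foo', 'тест', 'こんにちは'].each do |s|
  fits = s.bytesize <= MAX_BYTES
  puts "#{s}: #{s.size} chars, #{s.bytesize} bytes, fits=#{fits}"
end
```

Here the last string has only 5 characters but 15 bytes, so checking size instead of bytesize would wrongly accept it.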

Related: see Querying.

Source
static VALUErb_str_byteslice(int argc, VALUE *argv, VALUE str){    if (argc == 2) {        long beg = NUM2LONG(argv[0]);        long len = NUM2LONG(argv[1]);        return str_byte_substr(str, beg, len, TRUE);    }    rb_check_arity(argc, 1, 2);    return str_byte_aref(str, argv[0]);}

Returns a substring of self, or nil if the substring cannot be constructed.

With integer arguments offset and length given, returns the substring beginning at the given offset and of the given length (as available):

s = '0123456789' # => "0123456789"
s.byteslice(2) # => "2"
s.byteslice(200) # => nil
s.byteslice(4, 3) # => "456"
s.byteslice(4, 30) # => "456789"

Returns nil if length is negative or offset falls outside of self:

s.byteslice(4, -1) # => nil
s.byteslice(40, 2) # => nil

Counts backwards from the end of self if offset is negative:

s = '0123456789' # => "0123456789"
s.byteslice(-4) # => "6"
s.byteslice(-4, 3) # => "678"

With Range argument range given, returns byteslice(range.begin, range.size):

s = '0123456789' # => "0123456789"
s.byteslice(4..6) # => "456"
s.byteslice(-6..-4) # => "456"
s.byteslice(5..2) # => "" # range.size is zero.
s.byteslice(40..42) # => nil

The starting and ending offsets need not be on character boundaries:

s = 'こんにちは'
s.byteslice(0, 3) # => "こ"
s.byteslice(1, 3) # => "\x81\x93\xE3"

The encodings of self and the returned substring are always the same:

s.encoding # => #<Encoding:UTF-8>
s.byteslice(0, 3).encoding # => #<Encoding:UTF-8>
s.byteslice(1, 3).encoding # => #<Encoding:UTF-8>

But, depending on the character boundaries, the encoding of the returned substring may not be valid:

s.valid_encoding? # => true
s.byteslice(0, 3).valid_encoding? # => true
s.byteslice(1, 3).valid_encoding? # => false
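Because a slice can end in the middle of a character, code that truncates to a byte budget usually has to repair the tail. The following is a minimal sketch of one way to do that; truncate_bytes is a hypothetical helper (not part of String), and it assumes max_bytes is non-negative.

```ruby
# Hypothetical helper: returns the longest prefix of str that is at most
# max_bytes bytes long and still validly encoded.
# Assumes max_bytes >= 0 (byteslice returns nil for a negative length).
def truncate_bytes(str, max_bytes)
  slice = str.byteslice(0, max_bytes)
  # Drop one trailing byte at a time until no partial character remains.
  slice = slice.byteslice(0, slice.bytesize - 1) until slice.valid_encoding?
  slice
end

p truncate_bytes('こんにちは', 7)  # "こん" (6 bytes; the 7th byte was a partial character)
p truncate_bytes('hello', 3)       # "hel"
```

The loop always terminates, because an empty string is validly encoded.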

Related: see Converting to New String.

Source
static VALUErb_str_bytesplice(int argc, VALUE *argv, VALUE str){    long beg, len, vbeg, vlen;    VALUE val;    int cr;    rb_check_arity(argc, 2, 5);    if (!(argc == 2 || argc == 3 || argc == 5)) {        rb_raise(rb_eArgError, "wrong number of arguments (given %d, expected 2, 3, or 5)", argc);    }    if (argc == 2 || (argc == 3 && !RB_INTEGER_TYPE_P(argv[0]))) {        if (!rb_range_beg_len(argv[0], &beg, &len, RSTRING_LEN(str), 2)) {            rb_raise(rb_eTypeError, "wrong argument type %s (expected Range)",                     rb_builtin_class_name(argv[0]));        }        val = argv[1];        StringValue(val);        if (argc == 2) {            /* bytesplice(range, str) */            vbeg = 0;            vlen = RSTRING_LEN(val);        }        else {            /* bytesplice(range, str, str_range) */            if (!rb_range_beg_len(argv[2], &vbeg, &vlen, RSTRING_LEN(val), 2)) {                rb_raise(rb_eTypeError, "wrong argument type %s (expected Range)",                         rb_builtin_class_name(argv[2]));            }        }    }    else {        beg = NUM2LONG(argv[0]);        len = NUM2LONG(argv[1]);        val = argv[2];        StringValue(val);        if (argc == 3) {            /* bytesplice(index, length, str) */            vbeg = 0;            vlen = RSTRING_LEN(val);        }        else {            /* bytesplice(index, length, str, str_index, str_length) */            vbeg = NUM2LONG(argv[3]);            vlen = NUM2LONG(argv[4]);        }    }    str_check_beg_len(str, &beg, &len);    str_check_beg_len(val, &vbeg, &vlen);    str_modify_keep_cr(str);    if (RB_UNLIKELY(ENCODING_GET_INLINED(str) != ENCODING_GET_INLINED(val))) {        rb_enc_associate(str, rb_enc_check(str, val));    }    rb_str_update_1(str, beg, len, val, vbeg, vlen);    cr = ENC_CODERANGE_AND(ENC_CODERANGE(str), ENC_CODERANGE(val));    if (cr != ENC_CODERANGE_BROKEN)        ENC_CODERANGE_SET(str, cr);    return str;}

Replaces target bytes in self with source bytes from the given string str; returns self.

In the first form, arguments offset and length determine the target bytes, and the source bytes are all of the given str:

'0123456789'.bytesplice(0, 3, 'abc') # => "abc3456789"
'0123456789'.bytesplice(3, 3, 'abc') # => "012abc6789"
'0123456789'.bytesplice(0, 50, 'abc') # => "abc"
'0123456789'.bytesplice(50, 3, 'abc') # Raises IndexError.

The counts of the target bytes and source bytes may be different:

'0123456789'.bytesplice(0, 6, 'abc') # => "abc6789" # Shorter source.
'0123456789'.bytesplice(0, 1, 'abc') # => "abc123456789" # Shorter target.

And either count may be zero (i.e., specifying an empty string):

'0123456789'.bytesplice(0, 3, '') # => "3456789" # Empty source.
'0123456789'.bytesplice(0, 0, 'abc') # => "abc0123456789" # Empty target.

In the second form, just as in the first, arguments offset and length determine the target bytes; argument str contains the source bytes, and the additional arguments str_offset and str_length determine the actual source bytes:

'0123456789'.bytesplice(0, 3, 'abc', 0, 3) # => "abc3456789"
'0123456789'.bytesplice(0, 3, 'abc', 1, 1) # => "b3456789" # Shorter source.
'0123456789'.bytesplice(0, 1, 'abc', 0, 3) # => "abc123456789" # Shorter target.
'0123456789'.bytesplice(0, 3, 'abc', 1, 0) # => "3456789" # Empty source.
'0123456789'.bytesplice(0, 0, 'abc', 0, 3) # => "abc0123456789" # Empty target.

In the third form, argument range determines the target bytes and the source bytes are all of the given str:

'0123456789'.bytesplice(0..2, 'abc') # => "abc3456789"
'0123456789'.bytesplice(3..5, 'abc') # => "012abc6789"
'0123456789'.bytesplice(0..5, 'abc') # => "abc6789" # Shorter source.
'0123456789'.bytesplice(0..0, 'abc') # => "abc123456789" # Shorter target.
'0123456789'.bytesplice(0..2, '') # => "3456789" # Empty source.
'0123456789'.bytesplice(0...0, 'abc') # => "abc0123456789" # Empty target.

In the fourth form, just as in the third, argument range determines the target bytes; argument str contains the source bytes, and the additional argument str_range determines the actual source bytes:

'0123456789'.bytesplice(0..2, 'abc', 0..2) # => "abc3456789"
'0123456789'.bytesplice(3..5, 'abc', 0..2) # => "012abc6789"
'0123456789'.bytesplice(0..2, 'abc', 0..1) # => "ab3456789" # Shorter source.
'0123456789'.bytesplice(0..1, 'abc', 0..2) # => "abc23456789" # Shorter target.
'0123456789'.bytesplice(0..2, 'abc', 0...0) # => "3456789" # Empty source.
'0123456789'.bytesplice(0...0, 'abc', 0..2) # => "abc0123456789" # Empty target.

In any of the forms, the beginnings and endings of both source and target must be on character boundaries.

In these examples, self has five 3-byte characters, and so has character boundaries at offsets 0, 3, 6, 9, 12, and 15.

'こんにちは'.bytesplice(0, 3, 'abc') # => "abcんにちは"
'こんにちは'.bytesplice(1, 3, 'abc') # Raises IndexError.
'こんにちは'.bytesplice(0, 2, 'abc') # Raises IndexError.
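The sketch below shows the in-place nature of bytesplice and one way to handle the boundary error. It assumes Ruby 3.2 or later (where the three-argument form is available) and uses dup so the example works even when string literals are frozen.

```ruby
# Assumes Ruby 3.2+ and UTF-8 source encoding.
s = 'こんにちは'.dup           # Unfrozen copy; bytesplice modifies the receiver.
s.bytesplice(0, 3, 'abc')     # Offsets 0 and 3 are character boundaries.
puts s                        # abcんにちは

t = 'こんにちは'.dup
begin
  t.bytesplice(1, 3, 'abc')   # Offset 1 falls inside the first character.
rescue IndexError => e
  puts "IndexError: #{e.message}"
end
puts t                        # こんにちは (the failed call left t unchanged)
```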
Source
static VALUErb_str_capitalize(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_TITLECASE;    VALUE ret;    flags = check_case_options(argc, argv, flags);    enc = str_true_enc(str);    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return str;    if (flags&ONIGENC_CASE_ASCII_ONLY) {        ret = rb_str_new(0, RSTRING_LEN(str));        rb_str_ascii_casemap(str, ret, &flags, enc);    }    else {        ret = rb_str_casemap(str, &flags, enc);    }    return ret;}

Returns a string containing the characters in self, each with possibly changed case:

  • The first character is upcased.

  • All other characters are downcased.

Examples:

'hello world'.capitalize # => "Hello world"
'HELLO WORLD'.capitalize # => "Hello world"

Some characters do not have upcase and downcase, and so are not changed; see Case Mapping:

'1, 2, 3, ...'.capitalize # => "1, 2, 3, ..."

The casing is affected by the given mapping, which may be :ascii or :turkic (:fold is accepted only by the downcasing methods); see Case Mappings.

Related: see Converting to New String.

Source
static VALUErb_str_capitalize_bang(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_TITLECASE;    flags = check_case_options(argc, argv, flags);    str_modify_keep_cr(str);    enc = str_true_enc(str);    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return Qnil;    if (flags&ONIGENC_CASE_ASCII_ONLY)        rb_str_ascii_casemap(str, str, &flags, enc);    else        str_shared_replace(str, rb_str_casemap(str, &flags, enc));    if (ONIGENC_CASE_MODIFIED&flags) return str;    return Qnil;}

Like String#capitalize, except that:

  • Changes character casings in self (not in a copy of self).

  • Returns self if any changes are made, nil otherwise.
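The nil return value is easy to trip over when chaining. The following sketch illustrates the two outcomes and a safer pattern; the variable names are only illustrative.

```ruby
changed   = 'hello world'.dup
unchanged = 'Hello world'.dup

p changed.capitalize!    # "Hello world" (self, because the string changed)
p unchanged.capitalize!  # nil (already capitalized, so nothing changed)

# Because the result may be nil, avoid calling further methods on it.
# Mutate first, then use the receiver itself:
name = 'ALICE'.dup
name.capitalize!
puts name                # Alice
```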

Related: see Modifying.

Source
static VALUErb_str_casecmp(VALUE str1, VALUE str2){    VALUE s = rb_check_string_type(str2);    if (NIL_P(s)) {        return Qnil;    }    return str_casecmp(str1, s);}

Ignoring case, compares self and other_string; returns:

  • -1 if self.downcase is smaller than other_string.downcase.

  • 0 if the two are equal.

  • 1 if self.downcase is larger than other_string.downcase.

  • nil if the two are incomparable.

See Case Mapping.

Examples:

'foo'.casecmp('goo') # => -1
'goo'.casecmp('foo') # => 1
'foo'.casecmp('food') # => -1
'food'.casecmp('foo') # => 1
'FOO'.casecmp('foo') # => 0
'foo'.casecmp('FOO') # => 0
'foo'.casecmp(1) # => nil

Related: see Comparing.

Source
static VALUErb_str_casecmp_p(VALUE str1, VALUE str2){    VALUE s = rb_check_string_type(str2);    if (NIL_P(s)) {        return Qnil;    }    return str_casecmp_p(str1, s);}

Returns true if self and other_string are equal after Unicode case folding, false if unequal, nil if incomparable.

See Case Mapping.

Examples:

'foo'.casecmp?('goo') # => false
'goo'.casecmp?('foo') # => false
'foo'.casecmp?('food') # => false
'food'.casecmp?('foo') # => false
'FOO'.casecmp?('foo') # => true
'foo'.casecmp?('FOO') # => true
'foo'.casecmp?(1) # => nil
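The two methods can disagree on non-ASCII text. To the best of my understanding, casecmp ignores case only for ASCII letters, while casecmp? applies Unicode case folding; the sketch below illustrates that difference under a UTF-8 source encoding.

```ruby
# Assumes UTF-8 source encoding.
a = 'äöü'
b = 'ÄÖÜ'

p a.casecmp(b)           # 1 (non-ASCII letters are compared as-is)
p a.casecmp?(b)          # true (Unicode case folding)

# For ASCII-only strings the two methods agree:
p 'foo'.casecmp('FOO')   # 0
p 'foo'.casecmp?('FOO')  # true
```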

Related: see Comparing.

Source
static VALUErb_str_center(int argc, VALUE *argv, VALUE str){    return rb_str_justify(argc, argv, str, 'c');}

Returns a centered copy of self.

If integer argument size is greater than the size (in characters) of self, returns a new string of length size that is a copy of self, centered and padded on one or both ends with pad_string:

'hello'.center(6) # => "hello " # Padded on one end.
'hello'.center(10) # => "  hello   " # Padded on both ends.
'hello'.center(20, '-|') # => "-|-|-|-hello-|-|-|-|" # Some padding repeated.
'hello'.center(10, 'abcdefg') # => "abhelloabc" # Some padding not used.
'  hello  '.center(13) # => "    hello    "
'тест'.center(10) # => "   тест   "
'こんにちは'.center(10) # => "  こんにちは   " # Multi-byte characters.

If size is less than or equal to the size of self, returns an unpadded copy of self:

'hello'.center(5) # => "hello"
'hello'.center(-10) # => "hello"

Related: see Converting to New String.

Source
static VALUErb_str_chars(VALUE str){    VALUE ary = WANTARRAY("chars", rb_str_strlen(str));    return rb_str_enumerate_chars(str, ary);}

Returns an array of the characters in self:

'hello'.chars # => ["h", "e", "l", "l", "o"]
'тест'.chars # => ["т", "е", "с", "т"]
'こんにちは'.chars # => ["こ", "ん", "に", "ち", "は"]
''.chars # => []

Related: see Converting to Non-String.

Source
static VALUErb_str_chomp(int argc, VALUE *argv, VALUE str){    VALUE rs = chomp_rs(argc, argv);    if (NIL_P(rs)) return str_duplicate(rb_cString, str);    return rb_str_subseq(str, 0, chompped_length(str, rs));}

Returns a new string copied from self, with trailing characters possibly removed.

When line_sep is "\n", removes the last one or two characters if they are "\r", "\n", or "\r\n" (but not "\n\r"):

$/ # => "\n"
"abc\r".chomp # => "abc"
"abc\n".chomp # => "abc"
"abc\r\n".chomp # => "abc"
"abc\n\r".chomp # => "abc\n"
"тест\r\n".chomp # => "тест"
"こんにちは\r\n".chomp # => "こんにちは"

When line_sep is '' (an empty string), removes multiple trailing occurrences of "\n" or "\r\n" (but not "\r" or "\n\r"):

"abc\n\n\n".chomp('') # => "abc"
"abc\r\n\r\n\r\n".chomp('') # => "abc"
"abc\n\n\r\n\r\n\n\n".chomp('') # => "abc"
"abc\n\r\n\r\n\r".chomp('') # => "abc\n\r\n\r\n\r"
"abc\r\r\r".chomp('') # => "abc\r\r\r"

When line_sep is neither "\n" nor '', removes a single trailing line separator if there is one:

'abcd'.chomp('cd') # => "ab"
'abcdcd'.chomp('cd') # => "abcd"
'abcd'.chomp('xx') # => "abcd"
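A common use is normalizing lines that may end in either "\n" or "\r\n". The sketch below is only an illustration; the sample text and file name are made up.

```ruby
# Mixed line endings, as might come from files written on different systems.
text = "alpha\r\nbeta\ngamma"

lines = text.each_line.map(&:chomp)
p lines                             # ["alpha", "beta", "gamma"]

# A custom separator removes at most one trailing occurrence.
p 'report.csv.csv'.chomp('.csv')    # "report.csv"
```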

Related: see Converting to New String.

Source
static VALUErb_str_chomp_bang(int argc, VALUE *argv, VALUE str){    VALUE rs;    str_modifiable(str);    if (RSTRING_LEN(str) == 0 && argc < 2) return Qnil;    rs = chomp_rs(argc, argv);    if (NIL_P(rs)) return Qnil;    return rb_str_chomp_string(str, rs);}

Like String#chomp, except that:

  • Removes trailing characters from self (not from a copy of self).

  • Returns self if any characters are removed, nil otherwise.

Related: see Modifying.

Source
static VALUErb_str_chop(VALUE str){    return rb_str_subseq(str, 0, chopped_length(str));}

Returns a new string copied from self, with trailing characters possibly removed.

Removes "\r\n" if those are the last two characters.

"abc\r\n".chop # => "abc"
"тест\r\n".chop # => "тест"
"こんにちは\r\n".chop # => "こんにちは"

Otherwise removes the last character if it exists.

'abcd'.chop # => "abc"
'тест'.chop # => "тес"
'こんにちは'.chop # => "こんにち"
''.chop # => ""

If you only need to remove the newline separator at the end of the string, String#chomp is a better alternative.
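To make the difference concrete, this short sketch runs both methods over the same inputs; the sample strings are arbitrary.

```ruby
["abc\n", "abc\r\n", 'abc'].each do |s|
  puts "#{s.inspect}: chomp=#{s.chomp.inspect} chop=#{s.chop.inspect}"
end
```

For the first two inputs both methods return "abc"; for the last one, chomp leaves the string intact while chop removes the final 'c'.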

Related: see Converting to New String.

Source
static VALUErb_str_chop_bang(VALUE str){    str_modify_keep_cr(str);    if (RSTRING_LEN(str) > 0) {        long len;        len = chopped_length(str);        STR_SET_LEN(str, len);        TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));        if (ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {            ENC_CODERANGE_CLEAR(str);        }        return str;    }    return Qnil;}

Like String#chop, except that:

  • Removes trailing characters from self (not from a copy of self).

  • Returns self if any characters are removed, nil otherwise.

Related: see Modifying.

Source
static VALUErb_str_chr(VALUE str){    return rb_str_substr(str, 0, 1);}

Returns a string containing the first character of self:

'hello'.chr # => "h"
'тест'.chr # => "т"
'こんにちは'.chr # => "こ"
''.chr # => ""

Related: see Converting to New String.

Source
static VALUErb_str_clear(VALUE str){    str_discard(str);    STR_SET_EMBED(str);    STR_SET_LEN(str, 0);    RSTRING_PTR(str)[0] = 0;    if (rb_enc_asciicompat(STR_ENC_GET(str)))        ENC_CODERANGE_SET(str, ENC_CODERANGE_7BIT);    else        ENC_CODERANGE_SET(str, ENC_CODERANGE_VALID);    return str;}

Removes the contents of self:

s = 'foo'
s.clear # => ""
s # => ""

Related: see Modifying.

Source
static VALUErb_str_codepoints(VALUE str){    VALUE ary = WANTARRAY("codepoints", rb_str_strlen(str));    return rb_str_enumerate_codepoints(str, ary);}

Returns an array of the codepoints in self; each codepoint is the integer value for a character:

'hello'.codepoints # => [104, 101, 108, 108, 111]
'тест'.codepoints # => [1090, 1077, 1089, 1090]
'こんにちは'.codepoints # => [12371, 12435, 12395, 12385, 12399]
''.codepoints # => []

Related: see Converting to Non-String.

Source
static VALUErb_str_concat_multi(int argc, VALUE *argv, VALUE str){    str_modifiable(str);    if (argc == 1) {        return rb_str_concat(str, argv[0]);    }    else if (argc > 1) {        int i;        VALUE arg_str = rb_str_tmp_new(0);        rb_enc_copy(arg_str, str);        for (i = 0; i < argc; i++) {            rb_str_concat(arg_str, argv[i]);        }        rb_str_buf_append(str, arg_str);    }    return str;}

Concatenates each object in objects to self; returns self:

'foo'.concat('bar', 'baz') # => "foobarbaz"

For each given object object that is an integer, the value is considered a codepoint and converted to a character before concatenation:

'foo'.concat(32, 'bar', 32, 'baz') # => "foo bar baz" # Embeds spaces.
'те'.concat(1089, 1090) # => "тест"
'こん'.concat(12395, 12385, 12399) # => "こんにちは"
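One consequence of the implementation shown above, in which multiple arguments are first joined into a temporary string and then appended, is that passing the receiver itself as an argument behaves differently from chained << calls. The following sketch illustrates this; dup is used so the strings are not frozen.

```ruby
s = 'foo'.dup
s.concat(s, s)   # Both arguments are read before s is modified.
p s              # "foofoofoo"

t = 'foo'.dup
t << t << t      # Each << sees the result of the previous one.
p t              # "foofoofoofoo"

u = 'a'.dup
u.concat(45, 'b', 45, 'c')  # 45 is the codepoint for '-'.
p u              # "a-b-c"
```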

Related: see Converting to New String.

Source
static VALUErb_str_count(int argc, VALUE *argv, VALUE str){    char table[TR_TABLE_SIZE];    rb_encoding *enc = 0;    VALUE del = 0, nodel = 0, tstr;    char *s, *send;    int i;    int ascompat;    size_t n = 0;    rb_check_arity(argc, 1, UNLIMITED_ARGUMENTS);    tstr = argv[0];    StringValue(tstr);    enc = rb_enc_check(str, tstr);    if (argc == 1) {        const char *ptstr;        if (RSTRING_LEN(tstr) == 1 && rb_enc_asciicompat(enc) &&            (ptstr = RSTRING_PTR(tstr),             ONIGENC_IS_ALLOWED_REVERSE_MATCH(enc, (const unsigned char *)ptstr, (const unsigned char *)ptstr+1)) &&            !is_broken_string(str)) {            int clen;            unsigned char c = rb_enc_codepoint_len(ptstr, ptstr+1, &clen, enc);            s = RSTRING_PTR(str);            if (!s || RSTRING_LEN(str) == 0) return INT2FIX(0);            send = RSTRING_END(str);            while (s < send) {                if (*(unsigned char*)s++ == c) n++;            }            return SIZET2NUM(n);        }    }    tr_setup_table(tstr, table, TRUE, &del, &nodel, enc);    for (i=1; i<argc; i++) {        tstr = argv[i];        StringValue(tstr);        enc = rb_enc_check(str, tstr);        tr_setup_table(tstr, table, FALSE, &del, &nodel, enc);    }    s = RSTRING_PTR(str);    if (!s || RSTRING_LEN(str) == 0) return INT2FIX(0);    send = RSTRING_END(str);    ascompat = rb_enc_asciicompat(enc);    while (s < send) {        unsigned int c;        if (ascompat && (c = *(unsigned char*)s) < 0x80) {            if (table[c]) {                n++;            }            s++;        }        else {            int clen;            c = rb_enc_codepoint_len(s, send, &clen, enc);            if (tr_find(c, table, del, nodel)) {                n++;            }            s += clen;        }    }    return SIZET2NUM(n);}

Returns the total number of characters in self that are specified by the given selectors.

For one 1-character selector, returns the count of instances of that character:

s = 'abracadabra'
s.count('a') # => 5
s.count('b') # => 2
s.count('x') # => 0
s.count('') # => 0
s = 'тест'
s.count('т') # => 2
s.count('е') # => 1
s = 'よろしくお願いします'
s.count('よ') # => 1
s.count('し') # => 2

For one multi-character selector, returns the count of instances for all specified characters:

s = 'abracadabra'
s.count('ab') # => 7
s.count('abc') # => 8
s.count('abcd') # => 9
s.count('abcdr') # => 11
s.count('abcdrx') # => 11

Order and repetition do not matter:

s.count('ba') == s.count('ab') # => true
s.count('baab') == s.count('ab') # => true

For multiple selectors, forms a single selector that is the intersection of characters in all selectors and returns the count of instances for that selector:

s = 'abcdefg'
s.count('abcde', 'dcbfg') == s.count('bcd') # => true
s.count('abc', 'def') == s.count('') # => true

In a character selector, three characters get special treatment:

  • A caret ('^') functions as a negation operator for the immediately following characters:

    s = 'abracadabra'
    s.count('^bc') # => 8 # Count of all except 'b' and 'c'.

  • A hyphen ('-') between two other characters defines a range of characters:

    s = 'abracadabra'
    s.count('a-c') # => 8 # Count of all 'a', 'b', and 'c'.

  • A backslash ('\') acts as an escape for a caret, a hyphen, or another backslash:

    s = 'abracadabra'
    s.count('\^bc') # => 3 # Count of '^', 'b', and 'c'.
    s.count('a\-c') # => 6 # Count of 'a', '-', and 'c'.
    'foo\bar\baz'.count('\\') # => 2 # Count of '\'.

These usages may be mixed:

s = 'abracadabra'
s.count('a-cq-t') # => 10 # Multiple ranges.
s.count('ac-d') # => 7 # Range mixed with plain characters.
s.count('^a-c') # => 3 # Range mixed with negation.

For multiple selectors, all forms may be used, including negations, ranges, and escapes.

s = 'abracadabra'
s.count('^abc', '^def') == s.count('^abcdef') # => true
s.count('a-e', 'c-g') == s.count('cde') # => true
s.count('^abc', 'c-g') == s.count('defg') # => true
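As a small worked example of combining selectors, the sketch below classifies the characters of a sample string; the sample text and variable names are arbitrary.

```ruby
s = 'hello world'

vowels     = s.count('aeiou')           # Plain multi-character selector.
consonants = s.count('a-z', '^aeiou')   # Intersection: lowercase letters that are not vowels.
spaces     = s.count(' ')

puts "vowels=#{vowels} consonants=#{consonants} spaces=#{spaces}"
# vowels=3 consonants=7 spaces=1
```

The three counts add up to s.size here because every character falls into exactly one of the three groups.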

Related: see Querying.

Source
static VALUErb_str_crypt(VALUE str, VALUE salt){#ifdef HAVE_CRYPT_R    VALUE databuf;    struct crypt_data *data;#   define CRYPT_END() ALLOCV_END(databuf)#else    char *tmp_buf;    extern char *crypt(const char *, const char *);#   define CRYPT_END() rb_nativethread_lock_unlock(&crypt_mutex.lock)#endif    VALUE result;    const char *s, *saltp;    char *res;#ifdef BROKEN_CRYPT    char salt_8bit_clean[3];#endif    StringValue(salt);    mustnot_wchar(str);    mustnot_wchar(salt);    s = StringValueCStr(str);    saltp = RSTRING_PTR(salt);    if (RSTRING_LEN(salt) < 2 || !saltp[0] || !saltp[1]) {        rb_raise(rb_eArgError, "salt too short (need >=2 bytes)");    }#ifdef BROKEN_CRYPT    if (!ISASCII((unsigned char)saltp[0]) || !ISASCII((unsigned char)saltp[1])) {        salt_8bit_clean[0] = saltp[0] & 0x7f;        salt_8bit_clean[1] = saltp[1] & 0x7f;        salt_8bit_clean[2] = '\0';        saltp = salt_8bit_clean;    }#endif#ifdef HAVE_CRYPT_R    data = ALLOCV(databuf, sizeof(struct crypt_data));# ifdef HAVE_STRUCT_CRYPT_DATA_INITIALIZED    data->initialized = 0;# endif    res = crypt_r(s, saltp, data);#else    rb_nativethread_lock_lock(&crypt_mutex.lock);    res = crypt(s, saltp);#endif    if (!res) {        int err = errno;        CRYPT_END();        rb_syserr_fail(err, "crypt");    }#ifdef HAVE_CRYPT_R    result = rb_str_new_cstr(res);    CRYPT_END();#else    // We need to copy this buffer because it's static and we need to unlock the mutex    // before allocating a new object (the string to be returned). If we allocate while    // holding the lock, we could run GC which fires the VM barrier and causes a deadlock    // if other ractors are waiting on this lock.    size_t res_size = strlen(res)+1;    tmp_buf = ALLOCA_N(char, res_size); // should be small enough to alloca    memcpy(tmp_buf, res, res_size);    res = tmp_buf;    CRYPT_END();    result = rb_str_new_cstr(res);#endif    return result;}

Returns the string generated by calling the crypt(3) standard library function with str and salt_str, in this order, as its arguments. Please do not use this method any longer. It is legacy, provided only for backward compatibility with Ruby scripts from earlier days. It is unsuitable for contemporary programs for several reasons:

  • The behaviour of C’s crypt(3) depends on the OS on which it is run. The generated string lacks data portability.

  • On some OSes such as Mac OS, crypt(3) never fails (i.e. it silently ends up with unexpected results).

  • On some OSes such as Mac OS, crypt(3) is not thread safe.

  • So-called “traditional” usage of crypt(3) is very weak. According to its manpage, Linux’s traditional crypt(3) output has only 2**56 variations, which is too easy to brute force today. And this is the default behaviour.

  • In order to make things robust, some OSes implement so-called “modular” usage. To use it, you have to build up the salt_str parameter by hand, which is complex. Failure to generate a proper salt string tends not to yield any errors; typos in parameters are normally not detectable.

    • For instance, in the following example, the second invocation of String#crypt is wrong; it has a typo in “round=” (it lacks the “s”). However, the call does not fail and something unexpected is generated.

      "foo".crypt("$5$rounds=1000$salt$") # OK, proper usage
      "foo".crypt("$5$round=1000$salt$") # Typo not detected

  • Even in the “modular” mode, some hash functions are considered archaic and no longer recommended at all; for instance, module $1$ is officially abandoned by its author: see phk.freebsd.dk/sagas/md5crypt_eol/ . For another instance, module $3$ is considered completely broken: see the manpage of FreeBSD.

  • On some OSes such as Mac OS, there is no modular mode. Yet, as written above, crypt(3) on Mac OS never fails. This means that even if you build up a proper salt string, it generates a traditional DES hash anyway, and there is no way for you to be aware of it.

    "foo".crypt("$5$rounds=1000$salt$") # => "$5fNPQMxC5j6."

If for some reason you cannot migrate to other secure contemporary password hashing algorithms, install the string-crypt gem and require 'string/crypt' to continue using it.

Returns a frozen string equal to self.

The returned string is self if and only if all of the following are true:

  • self is already frozen.

  • self is an instance of String (rather than of a subclass of String).

  • self has no instance variables set on it.

Otherwise, the returned string is a frozen copy of self.

Returning self, when possible, saves duplicating self; see Data deduplication.

It may also save duplicating other, already-existing, strings:

s0 = 'foo'
s1 = 'foo'
s0.object_id == s1.object_id # => false
(-s0).object_id == (-s1).object_id # => true

Note that method -@ is convenient for defining a constant:

FileName = -'config/database.yml'

While its alias dedup is better suited for chaining:

'foo'.dedup.gsub!('o')
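The following sketch restates the deduplication behavior with explicit checks. It uses dup to start from two distinct, unfrozen strings; the variable names are only illustrative.

```ruby
a = 'foo'.dup
b = 'foo'.dup
p a.equal?(b)   # false (two separate objects with equal content)

x = -a          # Same as a.dedup.
y = -b

p x.frozen?     # true
p x.equal?(y)   # true (both calls return the same deduplicated object)
p x == a        # true (equal content)
p a.frozen?     # false (the original receiver is not frozen by -@)
```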

Related: see Freezing/Unfreezing.

Alias for: -@
Source
static VALUErb_str_delete(int argc, VALUE *argv, VALUE str){    str = str_duplicate(rb_cString, str);    rb_str_delete_bang(argc, argv, str);    return str;}

Returns a new string that is a copy of self with certain characters removed; the removed characters are all instances of those specified by the given string selectors.

For one 1-character selector, removes all instances of that character:

s = 'abracadabra'
s.delete('a') # => "brcdbr"
s.delete('b') # => "aracadara"
s.delete('x') # => "abracadabra"
s.delete('') # => "abracadabra"
s = 'тест'
s.delete('т') # => "ес"
s.delete('е') # => "тст"
s = 'よろしくお願いします'
s.delete('よ') # => "ろしくお願いします"
s.delete('し') # => "よろくお願います"

For one multi-character selector, removes all instances of the specified characters:

s = 'abracadabra'
s.delete('ab') # => "rcdr"
s.delete('abc') # => "rdr"
s.delete('abcd') # => "rr"
s.delete('abcdr') # => ""
s.delete('abcdrx') # => ""

Order and repetition do not matter:

s.delete('ba') == s.delete('ab') # => true
s.delete('baab') == s.delete('ab') # => true

For multiple selectors, forms a single selector that is the intersection of characters in all selectors and removes all instances of characters specified by that selector:

s = 'abcdefg'
s.delete('abcde', 'dcbfg') == s.delete('bcd') # => true
s.delete('abc', 'def') == s.delete('') # => true

In a character selector, three characters get special treatment:

  • A caret ('^') functions as a negation operator for the immediately following characters:

    s = 'abracadabra'
    s.delete('^bc') # => "bcb" # Deletes all except 'b' and 'c'.

  • A hyphen ('-') between two other characters defines a range of characters:

    s = 'abracadabra'
    s.delete('a-c') # => "rdr" # Deletes all 'a', 'b', and 'c'.

  • A backslash ('\') acts as an escape for a caret, a hyphen, or another backslash:

    s = 'abracadabra'
    s.delete('\^bc') # => "araadara" # Deletes all '^', 'b', and 'c'.
    s.delete('a\-c') # => "brdbr" # Deletes all 'a', '-', and 'c'.
    'foo\bar\baz'.delete('\\') # => "foobarbaz" # Deletes all '\'.

These usages may be mixed:

s = 'abracadabra'
s.delete('a-cq-t') # => "d" # Multiple ranges.
s.delete('ac-d') # => "brbr" # Range mixed with plain characters.
s.delete('^a-c') # => "abacaaba" # Range mixed with negation.

For multiple selectors, all forms may be used, including negations, ranges, and escapes.

s = 'abracadabra'
s.delete('^abc', '^def') == s.delete('^abcdef') # => true
s.delete('a-e', 'c-g') == s.delete('cde') # => true
s.delete('^abc', 'c-g') == s.delete('defg') # => true
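A negated range is a compact way to keep only one class of characters. The sketch below strips everything except digits from a made-up phone number, and also shows the nil result of the in-place variant, String#delete!, when nothing is removed.

```ruby
phone  = '(555) 123-4567'        # Made-up sample value.
digits = phone.delete('^0-9')    # Remove everything that is not a digit.

p digits   # "5551234567"
p phone    # "(555) 123-4567" (delete returns a new string)

s = 'abc'.dup
p s.delete!('x')   # nil (nothing was deleted)
p s.delete!('b')   # "ac"
```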

Related: see Converting to New String.

Source
static VALUErb_str_delete_bang(int argc, VALUE *argv, VALUE str){    char squeez[TR_TABLE_SIZE];    rb_encoding *enc = 0;    char *s, *send, *t;    VALUE del = 0, nodel = 0;    int modify = 0;    int i, ascompat, cr;    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return Qnil;    rb_check_arity(argc, 1, UNLIMITED_ARGUMENTS);    for (i=0; i<argc; i++) {        VALUE s = argv[i];        StringValue(s);        enc = rb_enc_check(str, s);        tr_setup_table(s, squeez, i==0, &del, &nodel, enc);    }    str_modify_keep_cr(str);    ascompat = rb_enc_asciicompat(enc);    s = t = RSTRING_PTR(str);    send = RSTRING_END(str);    cr = ascompat ? ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID;    while (s < send) {        unsigned int c;        int clen;        if (ascompat && (c = *(unsigned char*)s) < 0x80) {            if (squeez[c]) {                modify = 1;            }            else {                if (t != s) *t = c;                t++;            }            s++;        }        else {            c = rb_enc_codepoint_len(s, send, &clen, enc);            if (tr_find(c, squeez, del, nodel)) {                modify = 1;            }            else {                if (t != s) rb_enc_mbcput(c, t, enc);                t += clen;                if (cr == ENC_CODERANGE_7BIT) cr = ENC_CODERANGE_VALID;            }            s += clen;        }    }    TERM_FILL(t, TERM_LEN(str));    STR_SET_LEN(str, t - RSTRING_PTR(str));    ENC_CODERANGE_SET(str, cr);    if (modify) return str;    return Qnil;}

Like String#delete, but modifies self in place; returns self if any characters were deleted, nil otherwise.
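Because the return value is nil when nothing is deleted, the result should not be chained without a check. A minimal sketch of both outcomes (the sample string is only illustrative):

```ruby
s = +'abracadabra'   # Unary plus yields a mutable (unfrozen) string.
s.delete!('a-c')     # => "rdr"  # Characters deleted: returns self.
s                    # => "rdr"
s.delete!('xyz')     # => nil    # Nothing to delete: returns nil.
s                    # => "rdr"  # self is unchanged.
```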

Related: see Modifying.

Source
static VALUErb_str_delete_prefix(VALUE str, VALUE prefix){    long prefixlen;    prefixlen = deleted_prefix_length(str, prefix);    if (prefixlen <= 0) return str_duplicate(rb_cString, str);    return rb_str_subseq(str, prefixlen, RSTRING_LEN(str) - prefixlen);}

Returns a copy of self with leading substring prefix removed:

'oof'.delete_prefix('o')          # => "of"
'oof'.delete_prefix('oo')         # => "f"
'oof'.delete_prefix('oof')        # => ""
'oof'.delete_prefix('x')          # => "oof"
'тест'.delete_prefix('те')        # => "ст"
'こんにちは'.delete_prefix('こん') # => "にちは"

Related: see Converting to New String.

Source
static VALUErb_str_delete_prefix_bang(VALUE str, VALUE prefix){    long prefixlen;    str_modify_keep_cr(str);    prefixlen = deleted_prefix_length(str, prefix);    if (prefixlen <= 0) return Qnil;    return rb_str_drop_bytes(str, prefixlen);}

Like String#delete_prefix, except that self is modified in place; returns self if the prefix is removed, nil otherwise.
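A short sketch of the two return values; the path string is a hypothetical example, not taken from the library:

```ruby
path = +'lib/string.rb'      # Unary plus yields a mutable (unfrozen) string.
path.delete_prefix!('lib/')  # => "string.rb"  # Prefix removed: returns self.
path.delete_prefix!('lib/')  # => nil          # Prefix no longer present.
path                         # => "string.rb"
```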

Related: see Modifying.

Source
static VALUErb_str_delete_suffix(VALUE str, VALUE suffix){    long suffixlen;    suffixlen = deleted_suffix_length(str, suffix);    if (suffixlen <= 0) return str_duplicate(rb_cString, str);    return rb_str_subseq(str, 0, RSTRING_LEN(str) - suffixlen);}

Returns a copy of self with trailing substring suffix removed:

'foo'.delete_suffix('o')          # => "fo"
'foo'.delete_suffix('oo')         # => "f"
'foo'.delete_suffix('foo')        # => ""
'foo'.delete_suffix('f')          # => "foo"
'foo'.delete_suffix('x')          # => "foo"
'тест'.delete_suffix('ст')        # => "те"
'こんにちは'.delete_suffix('ちは') # => "こんに"

Related: see Converting to New String.

Source
static VALUErb_str_delete_suffix_bang(VALUE str, VALUE suffix){    long olen, suffixlen, len;    str_modifiable(str);    suffixlen = deleted_suffix_length(str, suffix);    if (suffixlen <= 0) return Qnil;    olen = RSTRING_LEN(str);    str_modify_keep_cr(str);    len = olen - suffixlen;    STR_SET_LEN(str, len);    TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));    if (ENC_CODERANGE(str) != ENC_CODERANGE_7BIT) {        ENC_CODERANGE_CLEAR(str);    }    return str;}

Like String#delete_suffix, except that self is modified in place; returns self if the suffix is removed, nil otherwise.

Related: see Modifying.

Source
static VALUErb_str_downcase(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_DOWNCASE;    VALUE ret;    flags = check_case_options(argc, argv, flags);    enc = str_true_enc(str);    if (case_option_single_p(flags, enc, str)) {        ret = rb_str_new(RSTRING_PTR(str), RSTRING_LEN(str));        str_enc_copy_direct(ret, str);        downcase_single(ret);    }    else if (flags&ONIGENC_CASE_ASCII_ONLY) {        ret = rb_str_new(0, RSTRING_LEN(str));        rb_str_ascii_casemap(str, ret, &flags, enc);    }    else {        ret = rb_str_casemap(str, &flags, enc);    }    return ret;}

Returns a new string containing the downcased characters in self:

'Hello, World!'.downcase       # => "hello, world!"
'ТЕСТ'.downcase                # => "тест"
'よろしくお願いします'.downcase # => "よろしくお願いします"

Some characters do not have upcased and downcased versions.

The casing may be affected by the given mapping; see Case Mapping.

Related: see Converting to New String.

Source
static VALUErb_str_downcase_bang(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_DOWNCASE;    flags = check_case_options(argc, argv, flags);    str_modify_keep_cr(str);    enc = str_true_enc(str);    if (case_option_single_p(flags, enc, str)) {        if (downcase_single(str))            flags |= ONIGENC_CASE_MODIFIED;    }    else if (flags&ONIGENC_CASE_ASCII_ONLY)        rb_str_ascii_casemap(str, str, &flags, enc);    else        str_shared_replace(str, rb_str_casemap(str, &flags, enc));    if (ONIGENC_CASE_MODIFIED&flags) return str;    return Qnil;}

Like String#downcase, except that:

  • Changes character casings in self (not in a copy of self).

  • Returns self if any changes are made, nil otherwise.

Related: see Modifying.

Source
VALUErb_str_dump(VALUE str){    int encidx = rb_enc_get_index(str);    rb_encoding *enc = rb_enc_from_index(encidx);    long len;    const char *p, *pend;    char *q, *qend;    VALUE result;    int u8 = (encidx == rb_utf8_encindex());    static const char nonascii_suffix[] = ".dup.force_encoding(\"%s\")";    len = 2;                    /* "" */    if (!rb_enc_asciicompat(enc)) {        len += strlen(nonascii_suffix) - rb_strlen_lit("%s");        len += strlen(enc->name);    }    p = RSTRING_PTR(str); pend = p + RSTRING_LEN(str);    while (p < pend) {        int clen;        unsigned char c = *p++;        switch (c) {          case '"':  case '\\':          case '\n': case '\r':          case '\t': case '\f':          case '\013': case '\010': case '\007': case '\033':            clen = 2;            break;          case '#':            clen = IS_EVSTR(p, pend) ? 2 : 1;            break;          default:            if (ISPRINT(c)) {                clen = 1;            }            else {                if (u8 && c > 0x7F) {   /* \u notation */                    int n = rb_enc_precise_mbclen(p-1, pend, enc);                    if (MBCLEN_CHARFOUND_P(n)) {                        unsigned int cc = rb_enc_mbc_to_codepoint(p-1, pend, enc);                        if (cc <= 0xFFFF)                            clen = 6;  /* \uXXXX */                        else if (cc <= 0xFFFFF)                            clen = 9;  /* \u{XXXXX} */                        else                            clen = 10; /* \u{XXXXXX} */                        p += MBCLEN_CHARFOUND_LEN(n)-1;                        break;                    }                }                clen = 4;       /* \xNN */            }            break;        }        if (clen > LONG_MAX - len) {            rb_raise(rb_eRuntimeError, "string size too big");        }        len += clen;    }    result = rb_str_new(0, len);    p = RSTRING_PTR(str); pend = p + RSTRING_LEN(str);    q = RSTRING_PTR(result); qend = q + len + 
1;    *q++ = '"';    while (p < pend) {        unsigned char c = *p++;        if (c == '"' || c == '\\') {            *q++ = '\\';            *q++ = c;        }        else if (c == '#') {            if (IS_EVSTR(p, pend)) *q++ = '\\';            *q++ = '#';        }        else if (c == '\n') {            *q++ = '\\';            *q++ = 'n';        }        else if (c == '\r') {            *q++ = '\\';            *q++ = 'r';        }        else if (c == '\t') {            *q++ = '\\';            *q++ = 't';        }        else if (c == '\f') {            *q++ = '\\';            *q++ = 'f';        }        else if (c == '\013') {            *q++ = '\\';            *q++ = 'v';        }        else if (c == '\010') {            *q++ = '\\';            *q++ = 'b';        }        else if (c == '\007') {            *q++ = '\\';            *q++ = 'a';        }        else if (c == '\033') {            *q++ = '\\';            *q++ = 'e';        }        else if (ISPRINT(c)) {            *q++ = c;        }        else {            *q++ = '\\';            if (u8) {                int n = rb_enc_precise_mbclen(p-1, pend, enc) - 1;                if (MBCLEN_CHARFOUND_P(n)) {                    int cc = rb_enc_mbc_to_codepoint(p-1, pend, enc);                    p += n;                    if (cc <= 0xFFFF)                        snprintf(q, qend-q, "u%04X", cc);    /* \uXXXX */                    else                        snprintf(q, qend-q, "u{%X}", cc);  /* \u{XXXXX} or \u{XXXXXX} */                    q += strlen(q);                    continue;                }            }            snprintf(q, qend-q, "x%02X", c);            q += 3;        }    }    *q++ = '"';    *q = '\0';    if (!rb_enc_asciicompat(enc)) {        snprintf(q, qend-q, nonascii_suffix, enc->name);        encidx = rb_ascii8bit_encindex();    }    /* result from dump is ASCII */    rb_enc_associate_index(result, encidx);    ENC_CODERANGE_SET(result, ENC_CODERANGE_7BIT);    return result;}

Returns a printable version of self, enclosed in double-quotes:

'hello'.dump # => "\"hello\""

Certain special characters are rendered with escapes:

'"'.dump  # => "\"\\\"\""
'\\'.dump # => "\"\\\\\""

Non-printing characters are rendered with escapes:

s = ''
s << 7  # Alarm (bell).
s << 8  # Backspace.
s << 9  # Horizontal tab.
s << 10 # Line feed.
s << 11 # Vertical tab.
s << 12 # Form feed.
s << 13 # Carriage return.
s       # => "\a\b\t\n\v\f\r"
s.dump  # => "\"\\a\\b\\t\\n\\v\\f\\r\""

If self is encoded in UTF-8 and contains non-ASCII characters, renders those characters as Unicode escape sequences:

'тест'.dump       # => "\"\\u0442\\u0435\\u0441\\u0442\""
'こんにちは'.dump  # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""

If the encoding of self is not ASCII-compatible (i.e., self.encoding.ascii_compatible? returns false), renders all ASCII-compatible bytes as ASCII characters and all other bytes as hexadecimal. Appends .dup.force_encoding("<encoding>"), where <encoding> is self.encoding.name:

s = 'hello'
s.encoding                # => #<Encoding:UTF-8>
s.dump                    # => "\"hello\""
s.encode('utf-16').dump   # => "\"\\xFE\\xFF\\x00h\\x00e\\x00l\\x00l\\x00o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"h\\x00e\\x00l\\x00l\\x00o\\x00\".dup.force_encoding(\"UTF-16LE\")"

s = 'тест'
s.encoding                # => #<Encoding:UTF-8>
s.dump                    # => "\"\\u0442\\u0435\\u0441\\u0442\""
s.encode('utf-16').dump   # => "\"\\xFE\\xFF\\x04B\\x045\\x04A\\x04B\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"B\\x045\\x04A\\x04B\\x04\".dup.force_encoding(\"UTF-16LE\")"

s = 'こんにちは'
s.encoding                # => #<Encoding:UTF-8>
s.dump                    # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""
s.encode('utf-16').dump   # => "\"\\xFE\\xFF0S0\\x930k0a0o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"S0\\x930k0a0o0\".dup.force_encoding(\"UTF-16LE\")"
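The dumped form contains only ASCII characters and can be turned back into the original string. The sketch below assumes String#undump, which is not described in this section, as the inverse operation; the sample string is illustrative:

```ruby
s = "caf\u00E9\n"   # A non-ASCII character followed by a newline.
d = s.dump          # => "\"caf\\u00E9\\n\""
d.ascii_only?       # => true
d.undump == s       # => true  # Assumes String#undump reverses dump.
```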

Related: see Converting to New String.

Source
static VALUErb_str_each_byte(VALUE str){    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_byte_size);    return rb_str_enumerate_bytes(str, 0);}

With a block given, calls the block with each successive byte from self; returns self:

a = []
'hello'.each_byte {|byte| a.push(byte) } # Five 1-byte characters.
a # => [104, 101, 108, 108, 111]
a = []
'тест'.each_byte {|byte| a.push(byte) } # Four 2-byte characters.
a # => [209, 130, 208, 181, 209, 129, 209, 130]
a = []
'こんにちは'.each_byte {|byte| a.push(byte) } # Five 3-byte characters.
a # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]

With no block given, returns an enumerator.
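The enumerator form can be chained with Enumerable methods instead of collecting into an array by hand. A minimal sketch:

```ruby
e = 'hello'.each_byte            # No block: an Enumerator is returned.
e.size                           # => 5
e.map {|byte| byte.to_s(16) }    # => ["68", "65", "6c", "6c", "6f"]
'тест'.each_byte.size            # => 8  # Bytes, not characters.
```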

Related: see Iterating.

Source
static VALUErb_str_each_char(VALUE str){    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);    return rb_str_enumerate_chars(str, 0);}

With a block given, calls the block with each successive character from self; returns self:

a = []
'hello'.each_char do |char|
  a.push(char)
end
a # => ["h", "e", "l", "l", "o"]
a = []
'тест'.each_char do |char|
  a.push(char)
end
a # => ["т", "е", "с", "т"]
a = []
'こんにちは'.each_char do |char|
  a.push(char)
end
a # => ["こ", "ん", "に", "ち", "は"]

With no block given, returns an enumerator.

Related: see Iterating.

Source
static VALUErb_str_each_codepoint(VALUE str){    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);    return rb_str_enumerate_codepoints(str, 0);}

With a block given, calls the block with each successive codepoint from self; each codepoint is the integer value for a character; returns self:

a = []
'hello'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [104, 101, 108, 108, 111]
a = []
'тест'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [1090, 1077, 1089, 1090]
a = []
'こんにちは'.each_codepoint do |codepoint|
  a.push(codepoint)
end
a # => [12371, 12435, 12395, 12385, 12399]

With no block given, returns an enumerator.

Related: see Iterating.

Source
static VALUErb_str_each_grapheme_cluster(VALUE str){    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_grapheme_cluster_size);    return rb_str_enumerate_grapheme_clusters(str, 0);}

With a block given, calls the given block with each successive grapheme cluster from self (see Unicode Grapheme Cluster Boundaries); returns self:

a = []
'hello'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["h", "e", "l", "l", "o"]
a = []
'тест'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["т", "е", "с", "т"]
a = []
'こんにちは'.each_grapheme_cluster do |grapheme_cluster|
  a.push(grapheme_cluster)
end
a # => ["こ", "ん", "に", "ち", "は"]

With no block given, returns an enumerator.

Related: see Iterating.

Source
static VALUErb_str_each_line(int argc, VALUE *argv, VALUE str){    RETURN_SIZED_ENUMERATOR(str, argc, argv, 0);    return rb_str_enumerate_lines(argc, argv, str, 0);}

With a block given, forms the substrings (lines) that are the result of splitting self at each occurrence of the given record_separator; passes each line to the block; returns self.

With the default record_separator:

$/ # => "\n"
s = <<~EOT
  This is the first line.
  This is line two.

  This is line four.
  This is line five.
EOT
s.each_line {|line| p line }

Output:

"This is the first line.\n"
"This is line two.\n"
"\n"
"This is line four.\n"
"This is line five.\n"

With a different record_separator:

record_separator = ' is '
s.each_line(record_separator) {|line| p line }

Output:

"This is "
"the first line.\nThis is "
"line two.\n\nThis is "
"line four.\nThis is "
"line five.\n"

With chomp as true, removes the trailing record_separator from each line:

s.each_line(chomp: true) {|line| p line }

Output:

"This is the first line."
"This is line two."
""
"This is line four."
"This is line five."

With an empty string as record_separator, forms and passes “paragraphs” by splitting at each occurrence of two or more newlines:

record_separator = ''
s.each_line(record_separator) {|line| p line }

Output:

"This is the first line.\nThis is line two.\n\n"
"This is line four.\nThis is line five.\n"

With no block given, returns an enumerator.
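The enumerator form combines with other Enumerator methods, for example to number lines. A minimal sketch, using a hypothetical three-line string:

```ruby
text = "alpha\nbeta\ngamma\n"   # Hypothetical sample text.
numbered = text.each_line(chomp: true).with_index(1).map do |line, number|
  "#{number}: #{line}"          # Index starts at 1 because of with_index(1).
end
numbered # => ["1: alpha", "2: beta", "3: gamma"]
```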

Related: see Iterating.

Source
static VALUErb_str_empty(VALUE str){    return RBOOL(RSTRING_LEN(str) == 0);}

Returns whether the length of self is zero:

'hello'.empty? # => false
' '.empty?     # => false
''.empty?      # => true

Related: see Querying.

Source
static VALUEstr_encode(int argc, VALUE *argv, VALUE str){    VALUE newstr = str;    int encidx = str_transcode(argc, argv, &newstr);    return encoded_dup(newstr, str, encidx);}

Returns a copy of self transcoded as determined by dst_encoding; see Encodings.

By default, raises an exception if self contains an invalid byte or a character not defined in dst_encoding; that behavior may be modified by encoding options; see below.

With no arguments:

  • Uses the same encoding if Encoding.default_internal is nil (the default):

    Encoding.default_internal # => nil
    s = "Ruby\x99".force_encoding('Windows-1252')
    s.encoding                # => #<Encoding:Windows-1252>
    s.bytes                   # => [82, 117, 98, 121, 153]
    t = s.encode              # => "Ruby\x99"
    t.encoding                # => #<Encoding:Windows-1252>
    t.bytes                   # => [82, 117, 98, 121, 153]

  • Otherwise, uses the encoding Encoding.default_internal:

    Encoding.default_internal = 'UTF-8'
    t = s.encode              # => "Ruby™"
    t.encoding                # => #<Encoding:UTF-8>

With only argument dst_encoding given, uses that encoding:

s = "Ruby\x99".force_encoding('Windows-1252')
s.encoding            # => #<Encoding:Windows-1252>
t = s.encode('UTF-8') # => "Ruby™"
t.encoding            # => #<Encoding:UTF-8>

With arguments dst_encoding and src_encoding given, interprets self using src_encoding, encodes the new string using dst_encoding:

s = "Ruby\x99"
t = s.encode('UTF-8', 'Windows-1252') # => "Ruby™"
t.encoding                            # => #<Encoding:UTF-8>

Optional keyword arguments enc_opts specify encoding options; see Encoding Options.

Note that, unless the invalid: :replace option is given, conversion from an encoding enc to the same encoding enc (whether enc is given explicitly or implicitly) is a no-op: the string is simply copied without any changes, and no exceptions are raised, even if there are invalid bytes.
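The following sketch shows the default exception, two of the encoding options (undef: and replace:), and the same-encoding case with and without invalid: :replace. The sample strings are illustrative; see Encoding Options for the full set of options:

```ruby
s = "Ruby\x99".force_encoding('Windows-1252')  # \x99 is the trademark sign here.

# The trademark sign has no US-ASCII equivalent, so the default is to raise.
error = begin
  s.encode('US-ASCII')
rescue Encoding::UndefinedConversionError => e
  e
end
error.class                                         # => Encoding::UndefinedConversionError

s.encode('US-ASCII', undef: :replace)               # => "Ruby?"
s.encode('US-ASCII', undef: :replace, replace: '*') # => "Ruby*"

bad = "abc\xFF"                                     # \xFF is never valid in UTF-8.
bad.encode('UTF-8').valid_encoding?                 # => false  # Same encoding: plain copy.
bad.encode('UTF-8', invalid: :replace)              # => "abc\uFFFD"
```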

Related: see Converting to New String.

Source
static VALUEstr_encode_bang(int argc, VALUE *argv, VALUE str){    VALUE newstr;    int encidx;    rb_check_frozen(str);    newstr = str;    encidx = str_transcode(argc, argv, &newstr);    if (encidx < 0) return str;    if (newstr == str) {        rb_enc_associate_index(str, encidx);        return str;    }    rb_str_shared_replace(str, newstr);    return str_encode_associate(str, encidx);}

Like encode, but applies encoding changes to self; returns self.

Related: see Modifying.

Source
VALUErb_obj_encoding(VALUE obj){    int idx = rb_enc_get_index(obj);    if (idx < 0) {        rb_raise(rb_eTypeError, "unknown encoding");    }    return rb_enc_from_encoding_index(idx & ENC_INDEX_MASK);}

Returns an Encoding object that represents the encoding of self; see Encodings.

Related: see Querying.

Source
static VALUErb_str_end_with(int argc, VALUE *argv, VALUE str){    int i;    for (i=0; i<argc; i++) {        VALUE tmp = argv[i];        const char *p, *s, *e;        long slen, tlen;        rb_encoding *enc;        StringValue(tmp);        enc = rb_enc_check(str, tmp);        if ((tlen = RSTRING_LEN(tmp)) == 0) return Qtrue;        if ((slen = RSTRING_LEN(str)) < tlen) continue;        p = RSTRING_PTR(str);        e = p + slen;        s = e - tlen;        if (!at_char_boundary(p, s, e, enc))            continue;        if (memcmp(s, RSTRING_PTR(tmp), tlen) == 0)            return Qtrue;    }    return Qfalse;}

Returns whether self ends with any of the given strings:

'foo'.end_with?('oo')         # => true
'foo'.end_with?('bar', 'oo')  # => true
'foo'.end_with?('bar', 'baz') # => false
'foo'.end_with?('')           # => true
'тест'.end_with?('т')         # => true
'こんにちは'.end_with?('は')   # => true

Related: see Querying.

Source
VALUErb_str_eql(VALUE str1, VALUE str2){    if (str1 == str2) return Qtrue;    if (!RB_TYPE_P(str2, T_STRING)) return Qfalse;    return rb_str_eql_internal(str1, str2);}

Returns whether self and object have the same length and content:

s = 'foo'
s.eql?('foo')  # => true
s.eql?('food') # => false
s.eql?('FOO')  # => false

Returns false if the two strings’ encodings are not compatible:

s0 = "äöü"                           # => "äöü"
s1 = s0.encode(Encoding::ISO_8859_1) # => "\xE4\xF6\xFC"
s0.encoding                          # => #<Encoding:UTF-8>
s1.encoding                          # => #<Encoding:ISO-8859-1>
s0.eql?(s1)                          # => false

See Encodings.

Related: see Querying.

Source
static VALUErb_str_force_encoding(VALUE str, VALUE enc){    str_modifiable(str);    rb_encoding *encoding = rb_to_encoding(enc);    int idx = rb_enc_to_index(encoding);    // If the encoding is unchanged, we do nothing.    if (ENCODING_GET(str) == idx) {        return str;    }    rb_enc_associate_index(str, idx);    // If the coderange was 7bit and the new encoding is ASCII-compatible    // we can keep the coderange.    if (ENC_CODERANGE(str) == ENC_CODERANGE_7BIT && encoding && rb_enc_asciicompat(encoding)) {        return str;    }    ENC_CODERANGE_CLEAR(str);    return str;}

Changes the encoding of self to the given encoding, which may be a string encoding name or an Encoding object; does not change the underlying bytes; returns self:

s = 'łał'
s.bytes                   # => [197, 130, 97, 197, 130]
s.encoding                # => #<Encoding:UTF-8>
s.valid_encoding?         # => true
s.force_encoding('ascii') # => "\xC5\x82a\xC5\x82"
s.encoding                # => #<Encoding:US-ASCII>
s.bytes                   # => [197, 130, 97, 197, 130]

Makes the change even if the given encoding is invalid for self (as is the change above):

s.valid_encoding? # => false
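A common use is to label raw bytes with the encoding they are already in, and only then transcode with String#encode. A minimal sketch, assuming the three bytes below are ISO-8859-1 text:

```ruby
bytes = "\xE4\xF6\xFC".b                        # Three raw bytes, tagged as binary.
latin1 = bytes.dup.force_encoding('ISO-8859-1') # Relabel only; bytes are untouched.
latin1.valid_encoding?                          # => true
latin1.bytes                                    # => [228, 246, 252]
utf8 = latin1.encode('UTF-8')                   # Transcode: the bytes change.
utf8                                            # => "äöü"
utf8.bytes                                      # => [195, 164, 195, 182, 195, 188]
```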

See Encodings.

Related: see Modifying.

Source
VALUErb_str_getbyte(VALUE str, VALUE index){    long pos = NUM2LONG(index);    if (pos < 0)        pos += RSTRING_LEN(str);    if (pos < 0 ||  RSTRING_LEN(str) <= pos)        return Qnil;    return INT2FIX((unsigned char)RSTRING_PTR(str)[pos]);}

Returns the byte at zero-based index as an integer:

s = 'foo'
s.getbyte(0) # => 102
s.getbyte(1) # => 111
s.getbyte(2) # => 111

Counts backward from the end if index is negative:

s.getbyte(-3) # => 102

Returns nil if index is out of range:

s.getbyte(3)  # => nil
s.getbyte(-4) # => nil

More examples:

s = 'тест'
s.bytes      # => [209, 130, 208, 181, 209, 129, 209, 130]
s.getbyte(2) # => 208
s = 'こんにちは'
s.bytes      # => [227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175]
s.getbyte(2) # => 147

Related: see Converting to Non-String.

Source
static VALUErb_str_grapheme_clusters(VALUE str){    VALUE ary = WANTARRAY("grapheme_clusters", rb_str_strlen(str));    return rb_str_enumerate_grapheme_clusters(str, ary);}

Returns an array of the grapheme clusters in self (see Unicode Grapheme Cluster Boundaries):

s = "ä-pqr-b̈-xyz-c̈"
s.size                   # => 16
s.bytesize               # => 19
s.grapheme_clusters.size # => 13
s.grapheme_clusters
# => ["ä", "-", "p", "q", "r", "-", "b̈", "-", "x", "y", "z", "-", "c̈"]

Details:

s = "ä"
s.grapheme_clusters           # => ["ä"]           # One grapheme cluster.
s.bytes                       # => [97, 204, 136]  # Three bytes.
s.chars                       # => ["a", "̈"]       # Two characters.
s.chars.map {|char| char.ord } # => [97, 776]       # Their values.

Related: see Converting to Non-String.

Source
static VALUErb_str_gsub(int argc, VALUE *argv, VALUE str){    return str_gsub(argc, argv, str, 0);}

Returns a copy of self with zero or more substrings replaced.

Argument pattern may be a string or a Regexp; argument replacement may be a string or a Hash. Varying types for the argument values make this method very versatile.

Below are some simple examples; for many more examples, see Substitution Methods.

With arguments pattern and string replacement given, replaces each matching substring with the given replacement string:

s = 'abracadabra'
s.gsub('ab', 'AB')   # => "ABracadABra"
s.gsub(/[a-c]/, 'X') # => "XXrXXXdXXrX"

With arguments pattern and hash replacement given, replaces each matching substring with a value from the given replacement hash, or removes it:

h = {'a' => 'A', 'b' => 'B', 'c' => 'C'}
s.gsub(/[a-c]/, h) # => "ABrACAdABrA"  # 'a', 'b', 'c' replaced.
s.gsub(/[a-d]/, h) # => "ABrACAABrA"   # 'd' removed.

With argument pattern and a block given, calls the block with each matching substring; replaces that substring with the block’s return value:

s.gsub(/[a-d]/) {|substring| substring.upcase }
# => "ABrACADABrA"

With argument pattern given, but neither replacement nor a block, returns a new Enumerator.
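The enumerator yields each matching substring. A brief sketch; the second call relies on Enumerator#with_index, an assumption from outside this section, to pass the block's value back as the replacement:

```ruby
s = 'abracadabra'
e = s.gsub(/[a-c]/)   # No replacement and no block: an Enumerator over the matches.
e.to_a                # => ["a", "b", "a", "c", "a", "a", "b", "a"]

# Number each 'a'; the value of the block becomes the replacement text.
s.gsub(/a/).with_index(1) {|match, i| "#{match}#{i}" }
# => "a1bra2ca3da4bra5"
```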

Related: see Converting to New String.

Source
static VALUErb_str_gsub_bang(int argc, VALUE *argv, VALUE str){    str_modify_keep_cr(str);    return str_gsub(argc, argv, str, 1);}

Like String#gsub, except that:

  • Performs substitutions in self (not in a copy of self).

  • Returns self if any substitutions are performed, nil otherwise.
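A minimal sketch of the two return values (the sample string is illustrative):

```ruby
s = +'hello'         # Unary plus yields a mutable (unfrozen) string.
s.gsub!(/l/, 'L')    # => "heLLo"  # Pattern matched: returns self.
s.gsub!(/z/, 'Z')    # => nil      # No match: returns nil.
s                    # => "heLLo"  # self keeps the earlier change.
```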

Related: see Modifying.

Source
static VALUErb_str_hash_m(VALUE str){    st_index_t hval = rb_str_hash(str);    return ST2FIX(hval);}

Returns the integer hash value for self.

Two String objects that have identical content and compatible encodings also have the same hash value; see Object#hash and Encodings:

s = 'foo'
h = s.hash                           # => -569050784
h == 'foo'.hash                      # => true
h == 'food'.hash                     # => false
h == 'FOO'.hash                      # => false
s0 = "äöü"
s1 = s0.encode(Encoding::ISO_8859_1)
s0.encoding                          # => #<Encoding:UTF-8>
s1.encoding                          # => #<Encoding:ISO-8859-1>
s0.hash == s1.hash                   # => false
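This property is what lets distinct String objects with the same content act as the same Hash key. A small sketch with hypothetical data:

```ruby
counts = Hash.new(0)
%w[foo bar foo].each {|word| counts[word] += 1 } # Distinct objects, equal content.
counts['foo']            # => 2
counts.size              # => 2
'foo'.hash == 'foo'.hash # => true
```

The integer itself should not be stored or compared across runs, since it can differ from one Ruby process to the next.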

Related: see Querying.

Source
static VALUErb_str_hex(VALUE str){    return rb_str_to_inum(str, 16, FALSE);}

Interprets the leading substring of self as hexadecimal, possibly signed; returns its value as an integer.

The leading substring is interpreted as hexadecimal when it begins with:

  • One or more characters representing hexadecimal digits (each in one of the ranges '0'..'9', 'a'..'f', or 'A'..'F'); the string to be interpreted ends at the first character that does not represent a hexadecimal digit:

    'f'.hex        # => 15
    '11'.hex       # => 17
    'FFF'.hex      # => 4095
    'fffg'.hex     # => 4095
    'foo'.hex      # => 15   # 'f' hexadecimal, 'oo' not.
    'bar'.hex      # => 186  # 'ba' hexadecimal, 'r' not.
    'deadbeef'.hex # => 3735928559

  • '0x' or '0X', followed by one or more hexadecimal digits:

    '0xfff'.hex  # => 4095
    '0xfffg'.hex # => 4095

Any of the above may be prefixed with '-', which negates the interpreted value:

'-fff'.hex   # => -4095
'-0xFFF'.hex # => -4095

For any substring not described above, returns zero:

'xxx'.hex # => 0
''.hex    # => 0

Note that, unlike oct, this method interprets only hexadecimal, and not binary, octal, or decimal notations:

'0b111'.hex # => 45329
'0o777'.hex # => 0
'0d999'.hex # => 55705

Related: see Converting to Non-String.

Source
VALUErb_str_include(VALUE str, VALUE arg){    long i;    StringValue(arg);    i = rb_str_index(str, arg, 0);    return RBOOL(i != -1);}

Returns whether self contains other_string:

s = 'bar'
s.include?('ba')  # => true
s.include?('ar')  # => true
s.include?('bar') # => true
s.include?('a')   # => true
s.include?('')    # => true
s.include?('foo') # => false

Related: see Querying.

Source
static VALUErb_str_index_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    rb_encoding *enc = STR_ENC_GET(str);    long pos;    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        long slen = str_strlen(str, enc); /* str's enc */        pos = NUM2LONG(initpos);        if (pos < 0 ? (pos += slen) < 0 : pos > slen) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }    }    else {        pos = 0;    }    if (RB_TYPE_P(sub, T_REGEXP)) {        pos = str_offset(RSTRING_PTR(str), RSTRING_END(str), pos,                         enc, single_byte_optimizable(str));        if (rb_reg_search(sub, str, pos, 0) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = rb_str_sublen(str, BEG(0));            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_index(str, sub, pos);        if (pos >= 0) {            pos = rb_str_sublen(str, pos);            return LONG2NUM(pos);        }    }    return Qnil;}

Returns the integer position of the first substring that matches the given argument pattern, or nil if none found.

When pattern is a string, returns the index of the first matching substring in self:

'foo'.index('f')        # => 0
'foo'.index('o')        # => 1
'foo'.index('oo')       # => 1
'foo'.index('ooo')      # => nil
'тест'.index('с')       # => 2  # Characters, not bytes.
'こんにちは'.index('ち') # => 3

When pattern is a Regexp, returns the index of the first match in self:

'foo'.index(/o./) # => 1
'foo'.index(/.o/) # => 0

When offset is non-negative, begins the search at position offset; the returned index is relative to the beginning of self:

'bar'.index('r', 0)        # => 2
'bar'.index('r', 1)        # => 2
'bar'.index('r', 2)        # => 2
'bar'.index('r', 3)        # => nil
'bar'.index(/[r-z]/, 0)    # => 2
'тест'.index('с', 1)       # => 2
'тест'.index('с', 2)       # => 2
'тест'.index('с', 3)       # => nil  # Offset in characters, not bytes.
'こんにちは'.index('ち', 2) # => 3

With negative integer argument offset, selects the search position by counting backward from the end of self:

'foo'.index('o', -1)  # => 2
'foo'.index('o', -2)  # => 1
'foo'.index('o', -3)  # => 1
'foo'.index('o', -4)  # => nil
'foo'.index(/o./, -2) # => 1
'foo'.index(/.o/, -2) # => 1

Related: see Querying.

Alias for: replace
Source
static VALUErb_str_insert(VALUE str, VALUE idx, VALUE str2){    long pos = NUM2LONG(idx);    if (pos == -1) {        return rb_str_append(str, str2);    }    else if (pos < 0) {        pos++;    }    rb_str_update(str, pos, 0, str2);    return str;}

Inserts the given other_string into self; returns self.

If the given index is non-negative, inserts other_string at offset index:

'foo'.insert(0, 'bar')        # => "barfoo"
'foo'.insert(1, 'bar')        # => "fbaroo"
'foo'.insert(3, 'bar')        # => "foobar"
'тест'.insert(2, 'bar')       # => "теbarст"  # Characters, not bytes.
'こんにちは'.insert(2, 'bar')  # => "こんbarにちは"

If the index is negative, counts backward from the end of self and inserts other_string after the offset:

'foo'.insert(-2, 'bar') # => "fobaro"
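A few more negative indexes make the "after the offset" rule easier to see; in particular, index -1 appends. This is a sketch using the same sample strings as above:

```ruby
# Unary plus yields a mutable (unfrozen) copy of each literal.
(+'foo').insert(-1, 'bar') # => "foobar"  # After the last character: an append.
(+'foo').insert(-2, 'bar') # => "fobaro"  # After the second-to-last character.
(+'foo').insert(-3, 'bar') # => "fbaroo"
(+'foo').insert(-4, 'bar') # => "barfoo"  # Before the first character.
```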

Related: see Modifying.

Source
VALUErb_str_inspect(VALUE str){    int encidx = ENCODING_GET(str);    rb_encoding *enc = rb_enc_from_index(encidx);    const char *p, *pend, *prev;    char buf[CHAR_ESC_LEN + 1];    VALUE result = rb_str_buf_new(0);    rb_encoding *resenc = rb_default_internal_encoding();    int unicode_p = rb_enc_unicode_p(enc);    int asciicompat = rb_enc_asciicompat(enc);    if (resenc == NULL) resenc = rb_default_external_encoding();    if (!rb_enc_asciicompat(resenc)) resenc = rb_usascii_encoding();    rb_enc_associate(result, resenc);    str_buf_cat2(result, "\"");    p = RSTRING_PTR(str); pend = RSTRING_END(str);    prev = p;    while (p < pend) {        unsigned int c, cc;        int n;        n = rb_enc_precise_mbclen(p, pend, enc);        if (!MBCLEN_CHARFOUND_P(n)) {            if (p > prev) str_buf_cat(result, prev, p - prev);            n = rb_enc_mbminlen(enc);            if (pend < p + n)                n = (int)(pend - p);            while (n--) {                snprintf(buf, CHAR_ESC_LEN, "\\x%02X", *p & 0377);                str_buf_cat(result, buf, strlen(buf));                prev = ++p;            }            continue;        }        n = MBCLEN_CHARFOUND_LEN(n);        c = rb_enc_mbc_to_codepoint(p, pend, enc);        p += n;        if ((asciicompat || unicode_p) &&          (c == '"'|| c == '\\' ||            (c == '#' &&             p < pend &&             MBCLEN_CHARFOUND_P(rb_enc_precise_mbclen(p,pend,enc)) &&             (cc = rb_enc_codepoint(p,pend,enc),              (cc == '$' || cc == '@' || cc == '{'))))) {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            str_buf_cat2(result, "\\");            if (asciicompat || enc == resenc) {                prev = p - n;                continue;            }        }        switch (c) {          case '\n': cc = 'n'; break;          case '\r': cc = 'r'; break;          case '\t': cc = 't'; break;          case '\f': cc = 'f'; break;          case '\013': cc = 'v'; break;          
case '\010': cc = 'b'; break;          case '\007': cc = 'a'; break;          case 033: cc = 'e'; break;          default: cc = 0; break;        }        if (cc) {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            buf[0] = '\\';            buf[1] = (char)cc;            str_buf_cat(result, buf, 2);            prev = p;            continue;        }        /* The special casing of 0x85 (NEXT_LINE) here is because         * Oniguruma historically treats it as printable, but it         * doesn't match the print POSIX bracket class or character         * property in regexps.         *         * See Ruby Bug #16842 for details:         * https://bugs.ruby-lang.org/issues/16842         */        if ((enc == resenc && rb_enc_isprint(c, enc) && c != 0x85) ||            (asciicompat && rb_enc_isascii(c, enc) && ISPRINT(c))) {            continue;        }        else {            if (p - n > prev) str_buf_cat(result, prev, p - n - prev);            rb_str_buf_cat_escaped_char(result, c, unicode_p);            prev = p;            continue;        }    }    if (p > prev) str_buf_cat(result, prev, p - prev);    str_buf_cat2(result, "\"");    return result;}

Returns a printable version of self, enclosed in double-quotes.

Most printable characters are rendered simply as themselves:

'abc'.inspect        # => "\"abc\""
'012'.inspect        # => "\"012\""
''.inspect           # => "\"\""
"\u000012".inspect   # => "\"\\u000012\""
'тест'.inspect       # => "\"тест\""
'こんにちは'.inspect  # => "\"こんにちは\""

But printable characters double-quote ('"') and backslash ('\\') are escaped:

'"'.inspect  # => "\"\\\"\""
'\\'.inspect # => "\"\\\\\""

Unprintable characters are the ASCII characters whose values are in range 0..31, along with the character whose value is 127.

Most of these characters are rendered thus:

0.chr.inspect # => "\"\\x00\""
1.chr.inspect # => "\"\\x01\""
2.chr.inspect # => "\"\\x02\""
# ...

A few, however, have special renderings:

7.chr.inspect  # => "\"\\a\""  # BEL
8.chr.inspect  # => "\"\\b\""  # BS
9.chr.inspect  # => "\"\\t\""  # TAB
10.chr.inspect # => "\"\\n\""  # LF
11.chr.inspect # => "\"\\v\""  # VT
12.chr.inspect # => "\"\\f\""  # FF
13.chr.inspect # => "\"\\r\""  # CR
27.chr.inspect # => "\"\\e\""  # ESC

Related: see Converting to Non-String.

Source
VALUErb_str_intern(VALUE str){    return sym_find_or_insert_dynamic_symbol(&ruby_global_symbols, str);}

Returns the Symbol object derived from self, creating it if it did not already exist:

'foo'.intern        # => :foo
'тест'.intern       # => :тест
'こんにちは'.intern  # => :こんにちは

Related: seeConverting to Non-String.

Also aliased as:to_sym
Source
VALUErb_str_length(VALUE str){    return LONG2NUM(str_strlen(str, NULL));}

Returns the count of characters (not bytes) inself:

'foo'.length# => 3'тест'.length# => 4'こんにちは'.length# => 5

Contrast withString#bytesize:

'foo'.bytesize# => 3'тест'.bytesize# => 8'こんにちは'.bytesize# => 15
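To see the two counts side by side, here is a small sketch. The helper name char_and_byte_counts is hypothetical, and the results assume the strings are UTF-8 encoded (the usual default for Ruby source files).

```ruby
# Hypothetical helper (not part of String): returns the character count
# and the byte count of the given string as a two-element array.
def char_and_byte_counts(string)
  [string.length, string.bytesize]
end

char_and_byte_counts('foo')  # => [3, 3]
char_and_byte_counts('тест') # => [4, 8]

# The same counts, derived from the enumerated characters and bytes.
'тест'.each_char.count # => 4
'тест'.bytes.size      # => 8
```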

Related: see Querying.

Also aliased as: size
Source
static VALUE
rb_str_lines(int argc, VALUE *argv, VALUE str)
{
    VALUE ary = WANTARRAY("lines", 0);
    return rb_str_enumerate_lines(argc, argv, str, ary);
}

Returns substrings (“lines”) of self according to the given arguments:

s = <<~EOT
  This is the first line.
  This is line two.

  This is line four.
  This is line five.
EOT

With the default argument values:

$/ # => "\n"
s.lines
# => ["This is the first line.\n", "This is line two.\n", "\n", "This is line four.\n", "This is line five.\n"]

With a different record_separator:

record_separator = ' is '
s.lines(record_separator)
# => ["This is ", "the first line.\nThis is ", "line two.\n\nThis is ", "line four.\nThis is ", "line five.\n"]

With keyword argument chomp as true, removes the trailing newline from each line:

s.lines(chomp: true)
# => ["This is the first line.", "This is line two.", "", "This is line four.", "This is line five."]
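Because each line keeps its separator by default, joining the lines should reproduce the original string. The sketch below illustrates this; the helper name round_trips? is hypothetical.

```ruby
# Hypothetical helper (not part of String): checks that joining the
# lines of a string (separators kept) gives back the original string.
def round_trips?(string)
  string.lines.join == string
end

text = "one\ntwo\n\nfour"
round_trips?(text)      # => true
text.lines              # => ["one\n", "two\n", "\n", "four"]
text.lines(chomp: true) # => ["one", "two", "", "four"]
```

With chomp: true the separators are dropped, so rebuilding the text requires re-inserting them, for example with join("\n").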

Related: see Converting to Non-String.

Source
static VALUE
rb_str_ljust(int argc, VALUE *argv, VALUE str)
{
    return rb_str_justify(argc, argv, str, 'l');
}

Returns a copy of self, left-justified and, if necessary, right-padded with the pad_string:

'hello'.ljust(10) # => "hello     "
'  hello'.ljust(10) # => "  hello   "
'hello'.ljust(10, 'ab') # => "helloababa"
'тест'.ljust(10) # => "тест      "
'こんにちは'.ljust(10) # => "こんにちは     "

If width <= self.length, returns a copy of self:

'hello'.ljust(5) # => "hello"
'hello'.ljust(1) # => "hello"  # Does not truncate to width.
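A common use of left-justification is aligning labels in columns. The following is a minimal sketch under that assumption; the helper name format_rows and the dot padding are illustrative choices only.

```ruby
# Hypothetical helper (not part of String): pads each label to the given
# width with dots, then appends the value, giving aligned rows.
def format_rows(rows, width)
  rows.map { |label, value| label.ljust(width, '.') + value }
end

format_rows([['foo', '1'], ['barbaz', '22']], 8)
# => ["foo.....1", "barbaz..22"]
```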

Related: see Converting to New String.

Source
static VALUE
rb_str_lstrip(VALUE str)
{
    char *start;
    long len, loffset;
    RSTRING_GETMEM(str, start, len);
    loffset = lstrip_offset(str, start, start+len, STR_ENC_GET(str));
    if (loffset <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, loffset, len - loffset);
}

Returns a copy of self with leading whitespace removed; see Whitespace in Strings:

whitespace = "\x00\t\n\v\f\r "
s = whitespace + 'abc' + whitespace # => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
s.lstrip # => "abc\u0000\t\n\v\f\r "

Related: see Converting to New String.

Source
static VALUE
rb_str_lstrip_bang(VALUE str)
{
    rb_encoding *enc;
    char *start, *s;
    long olen, loffset;
    str_modify_keep_cr(str);
    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    loffset = lstrip_offset(str, start, start+olen, enc);
    if (loffset > 0) {
        long len = olen-loffset;
        s = start + loffset;
        memmove(start, s, len);
        STR_SET_LEN(str, len);
        TERM_FILL(start+len, rb_enc_mbminlen(enc));
        return str;
    }
    return Qnil;
}

Like String#lstrip, except that:

  • Performs stripping in self (not in a copy of self).

  • Returns self if any characters are stripped, nil otherwise.
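The nil return value makes it possible to tell whether anything was stripped, but it also means the result should not be chained blindly. A minimal sketch, with a hypothetical helper name lstrip_changed?:

```ruby
# Hypothetical helper (not part of String): strips leading whitespace in
# place and reports whether the string was actually modified.
def lstrip_changed?(string)
  !string.lstrip!.nil?
end

s = +'  abc'       # Unary plus gives an unfrozen string.
lstrip_changed?(s) # => true
s                  # => "abc"
lstrip_changed?(s) # => false
```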

Related: see Modifying.

Source
static VALUE
rb_str_match_m(int argc, VALUE *argv, VALUE str)
{
    VALUE re, result;
    if (argc < 1)
        rb_check_arity(argc, 1, 2);
    re = argv[0];
    argv[0] = str;
    result = rb_funcallv(get_pat(re), rb_intern("match"), argc, argv);
    if (!NIL_P(result) && rb_block_given_p()) {
        return rb_yield(result);
    }
    return result;
}

Creates a MatchData object based on self and the given arguments; updates Regexp Global Variables.

  • Computes regexp by converting pattern (if not already a Regexp).

    regexp = Regexp.new(pattern)
  • Computes matchdata, which will be either a MatchData object or nil (see Regexp#match):

    matchdata = regexp.match(self[offset..])

With no block given, returns the computed matchdata or nil:

'foo'.match('f') # => #<MatchData "f">
'foo'.match('o') # => #<MatchData "o">
'foo'.match('x') # => nil
'foo'.match('f', 1) # => nil
'foo'.match('o', 1) # => #<MatchData "o">

With a block given and computed matchdata non-nil, calls the block with matchdata; returns the block’s return value:

'foo'.match(/o/) {|matchdata| matchdata } # => #<MatchData "o">

With a block given and nil matchdata, does not call the block:

'foo'.match(/x/) {|matchdata| fail 'Cannot happen' } # => nil
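The block form is convenient when a value should be derived from the match only if there is one. The sketch below assumes a hypothetical helper named first_number:

```ruby
# Hypothetical helper (not part of String): returns the first run of
# digits as an Integer, or nil when the string contains no digits
# (the block is not called when the match fails).
def first_number(string)
  string.match(/\d+/) { |matchdata| matchdata[0].to_i }
end

first_number('THX1138') # => 1138
first_number('hello')   # => nil
```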

Related: see Querying.

Source
static VALUE
rb_str_match_m_p(int argc, VALUE *argv, VALUE str)
{
    VALUE re;
    rb_check_arity(argc, 1, 2);
    re = get_pat(argv[0]);
    return rb_reg_match_p(re, str, argc > 1 ? NUM2LONG(argv[1]) : 0);
}

Returns whether a match is found for self and the given arguments; does not update Regexp Global Variables.

Computes regexp by converting pattern (if not already a Regexp):

regexp = Regexp.new(pattern)

Returns true if self[offset..].match(regexp) returns a MatchData object, false otherwise:

'foo'.match?(/o/) # => true
'foo'.match?('o') # => true
'foo'.match?(/x/) # => false
'foo'.match?('f', 1) # => false
'foo'.match?('o', 1) # => true
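The difference from String#match regarding global variables can be observed through $~. This is a rough sketch; the helper name last_match_after and its keyword argument are hypothetical, and it relies on $~ being local to the method in which the match is performed.

```ruby
# Hypothetical helper (not part of String): performs either match? or
# match inside a fresh method frame, then returns the value of $~.
def last_match_after(string, pattern, predicate:)
  if predicate
    string.match?(pattern) # Leaves $~ untouched.
  else
    string.match(pattern)  # Sets $~ to the MatchData (or nil).
  end
  $~
end

last_match_after('foo', /o/, predicate: false) # => #<MatchData "o">
last_match_after('foo', /o/, predicate: true)  # => nil
```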

Related: see Querying.

Alias for: succ
Alias for: succ!
Source
static VALUE
rb_str_oct(VALUE str)
{
    return rb_str_to_inum(str, -8, FALSE);
}

Interprets the leading substring of self as octal, binary, decimal, or hexadecimal, possibly signed; returns its value as an integer.

In brief:

# Interpreted as octal.
'777'.oct # => 511
'777x'.oct # => 511
'0777'.oct # => 511
'0o777'.oct # => 511
'-777'.oct # => -511
# Not interpreted as octal.
'0b111'.oct # => 7     # Interpreted as binary.
'0d999'.oct # => 999   # Interpreted as decimal.
'0xfff'.oct # => 4095  # Interpreted as hexadecimal.

The leading substring is interpreted as octal when it begins with:

  • One or more characters representing octal digits (each in the range '0'..'7'); the string to be interpreted ends at the first character that does not represent an octal digit:

    '7'.oct # => 7
    '11'.oct # => 9
    '777'.oct # => 511
    '0777'.oct # => 511
    '7778'.oct # => 511
    '777x'.oct # => 511
  • '0o', followed by one or more octal digits:

    '0o777'.oct # => 511
    '0o7778'.oct # => 511

The leading substring is not interpreted as octal when it begins with:

  • '0b', followed by one or more characters representing binary digits (each in the range '0'..'1'); the string to be interpreted ends at the first character that does not represent a binary digit. The string is interpreted as binary digits (base 2):

    '0b111'.oct # => 7
    '0b1112'.oct # => 7
  • '0d', followed by one or more characters representing decimal digits (each in the range '0'..'9'); the string to be interpreted ends at the first character that does not represent a decimal digit. The string is interpreted as decimal digits (base 10):

    '0d999'.oct # => 999
    '0d999x'.oct # => 999
  • '0x', followed by one or more characters representing hexadecimal digits (each in one of the ranges '0'..'9', 'a'..'f', or 'A'..'F'); the string to be interpreted ends at the first character that does not represent a hexadecimal digit. The string is interpreted as hexadecimal digits (base 16):

    '0xfff'.oct # => 4095
    '0xfffg'.oct # => 4095

Any of the above may be prefixed with '-', which negates the interpreted value:

'-777'.oct # => -511
'-0777'.oct # => -511
'-0b111'.oct # => -7
'-0xfff'.oct # => -4095

For any substring not described above, returns zero:

'foo'.oct # => 0
''.oct # => 0
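Because String#oct is lenient (it ignores trailing garbage and returns zero for unparseable input), a stricter parse is sometimes preferable. The sketch below contrasts it with Kernel#Integer in base 8; the helper name strict_oct is hypothetical.

```ruby
# Hypothetical helper (not part of String): parses the whole string as
# octal, returning nil instead of a partial or zero result when the
# string is not a valid octal numeral.
def strict_oct(string)
  Integer(string, 8)
rescue ArgumentError
  nil
end

'777x'.oct          # => 511
strict_oct('777x')  # => nil
strict_oct('777')   # => 511
strict_oct('0o777') # => 511
'foo'.oct           # => 0
strict_oct('foo')   # => nil
```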

Related: see Converting to Non-String.

Source
static VALUE
rb_str_ord(VALUE s)
{
    unsigned int c;
    c = rb_enc_codepoint(RSTRING_PTR(s), RSTRING_END(s), STR_ENC_GET(s));
    return UINT2NUM(c);
}

Returns the integer ordinal of the first character of self:

'h'.ord # => 104
'hello'.ord # => 104
'тест'.ord # => 1090
'こんにちは'.ord # => 12371

Related: see Converting to Non-String.

Source
static VALUErb_str_partition(VALUE str, VALUE sep){    long pos;    sep = get_pat_quoted(sep, 0);    if (RB_TYPE_P(sep, T_REGEXP)) {        if (rb_reg_search(sep, str, 0, 0) < 0) {            goto failed;        }        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        pos = BEG(0);        sep = rb_str_subseq(str, pos, END(0) - pos);    }    else {        pos = rb_str_index(str, sep, 0);        if (pos < 0) goto failed;    }    return rb_ary_new3(3, rb_str_subseq(str, 0, pos),                          sep,                          rb_str_subseq(str, pos+RSTRING_LEN(sep),                                             RSTRING_LEN(str)-pos-RSTRING_LEN(sep)));  failed:    return rb_ary_new3(3, str_duplicate(rb_cString, str), str_new_empty_String(str), str_new_empty_String(str));}

Returns a 3-element array of substrings of self.

If pattern is matched, returns the array:

[pre_match, first_match, post_match]

where:

  • first_match is the first-found matching substring.

  • pre_match and post_match are the preceding and following substrings.

If pattern is not matched, returns the array:

[self.dup, "", ""]

Note that in the examples below, a returned string 'hello' is a copy of self, not self.

If pattern is a Regexp, performs the equivalent of self.match(pattern) (also setting pattern-matching global variables):

'hello'.partition(/h/) # => ["", "h", "ello"]
'hello'.partition(/l/) # => ["he", "l", "lo"]
'hello'.partition(/l+/) # => ["he", "ll", "o"]
'hello'.partition(/o/) # => ["hell", "o", ""]
'hello'.partition(/^/) # => ["", "", "hello"]
'hello'.partition(//) # => ["", "", "hello"]
'hello'.partition(/$/) # => ["hello", "", ""]
'hello'.partition(/x/) # => ["hello", "", ""]

If pattern is not a Regexp, converts it to a string (if it is not already one), then performs the equivalent of self.index(pattern) (and does not set pattern-matching global variables):

'hello'.partition('h') # => ["", "h", "ello"]
'hello'.partition('l') # => ["he", "l", "lo"]
'hello'.partition('ll') # => ["he", "ll", "o"]
'hello'.partition('o') # => ["hell", "o", ""]
'hello'.partition('') # => ["", "", "hello"]
'hello'.partition('x') # => ["hello", "", ""]
'тест'.partition('т') # => ["", "т", "ест"]
'こんにちは'.partition('に') # => ["こん", "に", "ちは"]
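Since only the first match is used as the separator, partition suits splitting a key from a value that may itself contain the separator. A minimal sketch, assuming a hypothetical helper named parse_pair and a "key=value" input format:

```ruby
# Hypothetical helper (not part of String): splits "key=value" at the
# first '='; returns nil when the line contains no '='.
def parse_pair(line)
  key, separator, value = line.partition('=')
  return nil if separator.empty? # An empty middle element means no match.
  [key, value]
end

parse_pair('path=/usr/bin') # => ["path", "/usr/bin"]
parse_pair('a=b=c')         # => ["a", "b=c"]
parse_pair('no separator')  # => nil
```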

Related: see Converting to Non-String.

Source
static VALUE
rb_str_prepend_multi(int argc, VALUE *argv, VALUE str)
{
    str_modifiable(str);
    if (argc == 1) {
        rb_str_update(str, 0L, 0L, argv[0]);
    }
    else if (argc > 1) {
        int i;
        VALUE arg_str = rb_str_tmp_new(0);
        rb_enc_copy(arg_str, str);
        for (i = 0; i < argc; i++) {
            rb_str_append(arg_str, argv[i]);
        }
        rb_str_update(str, 0L, 0L, arg_str);
    }
    return str;
}

Prefixes to self the concatenation of the given other_strings; returns self:

'baz'.prepend('foo', 'bar') # => "foobarbaz"

Related: see Modifying.

Source
VALUE
rb_str_replace(VALUE str, VALUE str2)
{
    str_modifiable(str);
    if (str == str2) return str;
    StringValue(str2);
    str_discard(str);
    return str_replace(str, str2);
}

Replaces the contents of self with the contents of other_string; returns self:

s = 'foo' # => "foo"
s.replace('bar') # => "bar"
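Because the contents are replaced in place, every reference to the string sees the new contents. The sketch below illustrates this; the helper name same_object_after_replace? is hypothetical.

```ruby
# Hypothetical helper (not part of String): replaces the contents of
# string and reports whether it is still the same object afterwards.
def same_object_after_replace?(string, other)
  id_before = string.object_id
  string.replace(other)
  string.object_id == id_before
end

s = +'foo'
other_reference = s
same_object_after_replace?(s, 'bar') # => true
other_reference                      # => "bar"
```

By contrast, plain assignment (s = 'bar') would rebind the variable to a different object and leave other_reference unchanged.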

Related: see Modifying.

Also aliased as: initialize_copy
Source
static VALUErb_str_reverse(VALUE str){    rb_encoding *enc;    VALUE rev;    char *s, *e, *p;    int cr;    if (RSTRING_LEN(str) <= 1) return str_duplicate(rb_cString, str);    enc = STR_ENC_GET(str);    rev = rb_str_new(0, RSTRING_LEN(str));    s = RSTRING_PTR(str); e = RSTRING_END(str);    p = RSTRING_END(rev);    cr = ENC_CODERANGE(str);    if (RSTRING_LEN(str) > 1) {        if (single_byte_optimizable(str)) {            while (s < e) {                *--p = *s++;            }        }        else if (cr == ENC_CODERANGE_VALID) {            while (s < e) {                int clen = rb_enc_fast_mbclen(s, e, enc);                p -= clen;                memcpy(p, s, clen);                s += clen;            }        }        else {            cr = rb_enc_asciicompat(enc) ?                ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID;            while (s < e) {                int clen = rb_enc_mbclen(s, e, enc);                if (clen > 1 || (*s & 0x80)) cr = ENC_CODERANGE_UNKNOWN;                p -= clen;                memcpy(p, s, clen);                s += clen;            }        }    }    STR_SET_LEN(rev, RSTRING_LEN(str));    str_enc_copy_direct(rev, str);    ENC_CODERANGE_SET(rev, cr);    return rev;}

Returns a new string with the characters from self in reverse order.

'drawer'.reverse # => "reward"
'reviled'.reverse # => "deliver"
'stressed'.reverse # => "desserts"
'semordnilaps'.reverse # => "spalindromes"

Related: see Converting to New String.

Source
static VALUE
rb_str_reverse_bang(VALUE str)
{
    if (RSTRING_LEN(str) > 1) {
        if (single_byte_optimizable(str)) {
            char *s, *e, c;
            str_modify_keep_cr(str);
            s = RSTRING_PTR(str);
            e = RSTRING_END(str) - 1;
            while (s < e) {
                c = *s;
                *s++ = *e;
                *e-- = c;
            }
        }
        else {
            str_shared_replace(str, rb_str_reverse(str));
        }
    }
    else {
        str_modify_keep_cr(str);
    }
    return str;
}

Returns self with its characters reversed:

'drawer'.reverse! # => "reward"
'reviled'.reverse! # => "deliver"
'stressed'.reverse! # => "desserts"
'semordnilaps'.reverse! # => "spalindromes"

Related: see Modifying.

Source
static VALUErb_str_rindex_m(int argc, VALUE *argv, VALUE str){    VALUE sub;    VALUE initpos;    rb_encoding *enc = STR_ENC_GET(str);    long pos, len = str_strlen(str, enc); /* str's enc */    if (rb_scan_args(argc, argv, "11", &sub, &initpos) == 2) {        pos = NUM2LONG(initpos);        if (pos < 0 && (pos += len) < 0) {            if (RB_TYPE_P(sub, T_REGEXP)) {                rb_backref_set(Qnil);            }            return Qnil;        }        if (pos > len) pos = len;    }    else {        pos = len;    }    if (RB_TYPE_P(sub, T_REGEXP)) {        /* enc = rb_enc_check(str, sub); */        pos = str_offset(RSTRING_PTR(str), RSTRING_END(str), pos,                         enc, single_byte_optimizable(str));        if (rb_reg_search(sub, str, pos, 1) >= 0) {            VALUE match = rb_backref_get();            struct re_registers *regs = RMATCH_REGS(match);            pos = rb_str_sublen(str, BEG(0));            return LONG2NUM(pos);        }    }    else {        StringValue(sub);        pos = rb_str_rindex(str, sub, pos);        if (pos >= 0) {            pos = rb_str_sublen(str, pos);            return LONG2NUM(pos);        }    }    return Qnil;}

Returns the integer position of the last substring that matches the given argument pattern, or nil if none found.

When pattern is a string, returns the index of the last matching substring in self:

'foo'.rindex('f') # => 0
'foo'.rindex('o') # => 2
'foo'.rindex('oo') # => 1
'foo'.rindex('ooo') # => nil
'тест'.rindex('т') # => 3
'こんにちは'.rindex('ち') # => 3

When pattern is a Regexp, returns the index of the last match in self:

'foo'.rindex(/f/) # => 0
'foo'.rindex(/o/) # => 2
'foo'.rindex(/oo/) # => 1
'foo'.rindex(/ooo/) # => nil

When integer argument offset is non-negative, it specifies the last position at which a match may start:

'foo'.rindex('o', 0) # => nil
'foo'.rindex('o', 1) # => 1
'foo'.rindex('o', 2) # => 2
'foo'.rindex('o', 3) # => 2

With negative integer argument offset, selects the search position by counting backward from the end of self:

'foo'.rindex('o', -1) # => 2
'foo'.rindex('o', -2) # => 1
'foo'.rindex('o', -3) # => nil
'foo'.rindex('o', -4) # => nil

The last match is the match that starts at the last possible position, not the longest match near the end:

'foo'.rindex(/o+/) # => 2
$~ # => #<MatchData "o">

To get the last longest match, combine with negative lookbehind:

'foo'.rindex(/(?<!o)o+/) # => 1
$~ # => #<MatchData "oo">

Or use String#index with negative lookahead:

'foo'.index(/o+(?!.*o)/) # => 1
$~ # => #<MatchData "oo">
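A typical use of the returned position is taking the text that follows the last occurrence of a separator. The following is a minimal sketch; the helper name after_last is hypothetical, and it handles string separators only.

```ruby
# Hypothetical helper (not part of String): returns the part of string
# after the last occurrence of separator, or the whole string when the
# separator does not occur.
def after_last(string, separator)
  position = string.rindex(separator)
  position ? string[(position + separator.length)..] : string
end

after_last('a/b/c', '/') # => "c"
after_last('abc', '/')   # => "abc"
```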

Related: see Querying.

Source
static VALUE
rb_str_rjust(int argc, VALUE *argv, VALUE str)
{
    return rb_str_justify(argc, argv, str, 'r');
}

Returns a right-justified copy of self.

If integer argument width is greater than the size (in characters) of self, returns a new string of length width that is a copy of self, right justified and padded on the left with pad_string:

'hello'.rjust(10) # => "     hello"
'hello  '.rjust(10) # => "   hello  "
'hello'.rjust(10, 'ab') # => "ababahello"
'тест'.rjust(10) # => "      тест"
'こんにちは'.rjust(10) # => "     こんにちは"

If width <= self.size, returns a copy of self:

'hello'.rjust(5, 'ab') # => "hello"
'hello'.rjust(1, 'ab') # => "hello"

Related: see Converting to New String.

Source
static VALUErb_str_rpartition(VALUE str, VALUE sep){    long pos = RSTRING_LEN(str);    sep = get_pat_quoted(sep, 0);    if (RB_TYPE_P(sep, T_REGEXP)) {        if (rb_reg_search(sep, str, pos, 1) < 0) {            goto failed;        }        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        pos = BEG(0);        sep = rb_str_subseq(str, pos, END(0) - pos);    }    else {        pos = rb_str_sublen(str, pos);        pos = rb_str_rindex(str, sep, pos);        if (pos < 0) {            goto failed;        }    }    return rb_ary_new3(3, rb_str_subseq(str, 0, pos),                          sep,                          rb_str_subseq(str, pos+RSTRING_LEN(sep),                                        RSTRING_LEN(str)-pos-RSTRING_LEN(sep)));  failed:    return rb_ary_new3(3, str_new_empty_String(str), str_new_empty_String(str), str_duplicate(rb_cString, str));}

Returns a 3-element array of substrings of self.

Searches self for a match of pattern, seeking the last match.

If pattern is not matched, returns the array:

["", "", self.dup]

If pattern is matched, returns the array:

[pre_match, last_match, post_match]

where:

  • last_match is the last-found matching substring.

  • pre_match and post_match are the preceding and following substrings.

The pattern used is:

Note that in the examples below, a returned string 'hello' is a copy of self, not self.

If pattern is a Regexp, searches for the last matching substring (also setting pattern-matching global variables):

'hello'.rpartition(/l/) # => ["hel", "l", "o"]
'hello'.rpartition(/ll/) # => ["he", "ll", "o"]
'hello'.rpartition(/h/) # => ["", "h", "ello"]
'hello'.rpartition(/o/) # => ["hell", "o", ""]
'hello'.rpartition(//) # => ["hello", "", ""]
'hello'.rpartition(/x/) # => ["", "", "hello"]
'тест'.rpartition(/т/) # => ["тес", "т", ""]
'こんにちは'.rpartition(/に/) # => ["こん", "に", "ちは"]

If pattern is not a Regexp, converts it to a string (if it is not already one), then searches for the last matching substring (and does not set pattern-matching global variables):

'hello'.rpartition('l') # => ["hel", "l", "o"]
'hello'.rpartition('ll') # => ["he", "ll", "o"]
'hello'.rpartition('h') # => ["", "h", "ello"]
'hello'.rpartition('o') # => ["hell", "o", ""]
'hello'.rpartition('') # => ["hello", "", ""]
'тест'.rpartition('т') # => ["тес", "т", ""]
'こんにちは'.rpartition('に') # => ["こん", "に", "ちは"]
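Splitting at the last match is useful, for instance, for separating a file name from its final extension. A minimal sketch, assuming a hypothetical helper named split_extension (File.extname and File.basename are the usual tools for real paths):

```ruby
# Hypothetical helper (not part of String): splits a file name at its
# last dot; returns [name, nil] when there is no dot.
def split_extension(filename)
  base, dot, extension = filename.rpartition('.')
  dot.empty? ? [filename, nil] : [base, extension]
end

split_extension('archive.tar.gz') # => ["archive.tar", "gz"]
split_extension('README')         # => ["README", nil]
```

Note that on a failed match the original string appears in the last element of the array, not the first as with String#partition, which is why the sketch tests the middle element.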

Related: see Converting to Non-String.

Source
static VALUE
rb_str_rstrip(VALUE str)
{
    rb_encoding *enc;
    char *start;
    long olen, roffset;
    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    roffset = rstrip_offset(str, start, start+olen, enc);
    if (roffset <= 0) return str_duplicate(rb_cString, str);
    return rb_str_subseq(str, 0, olen-roffset);
}

Returns a copy of self with trailing whitespace removed; see Whitespace in Strings:

whitespace = "\x00\t\n\v\f\r "
s = whitespace + 'abc' + whitespace
s # => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
s.rstrip # => "\u0000\t\n\v\f\r abc"

Related: see Converting to New String.

Source
static VALUE
rb_str_rstrip_bang(VALUE str)
{
    rb_encoding *enc;
    char *start;
    long olen, roffset;
    str_modify_keep_cr(str);
    enc = STR_ENC_GET(str);
    RSTRING_GETMEM(str, start, olen);
    roffset = rstrip_offset(str, start, start+olen, enc);
    if (roffset > 0) {
        long len = olen - roffset;
        STR_SET_LEN(str, len);
        TERM_FILL(start+len, rb_enc_mbminlen(enc));
        return str;
    }
    return Qnil;
}

Like String#rstrip, except that:

  • Performs stripping in self (not in a copy of self).

  • Returns self if any characters are stripped, nil otherwise.

Related: see Modifying.

Source
static VALUErb_str_scan(VALUE str, VALUE pat){    VALUE result;    long start = 0;    long last = -1, prev = 0;    char *p = RSTRING_PTR(str); long len = RSTRING_LEN(str);    pat = get_pat_quoted(pat, 1);    mustnot_broken(str);    if (!rb_block_given_p()) {        VALUE ary = rb_ary_new();        while (!NIL_P(result = scan_once(str, pat, &start, 0))) {            last = prev;            prev = start;            rb_ary_push(ary, result);        }        if (last >= 0) rb_pat_search(pat, str, last, 1);        else rb_backref_set(Qnil);        return ary;    }    while (!NIL_P(result = scan_once(str, pat, &start, 1))) {        last = prev;        prev = start;        rb_yield(result);        str_mod_check(str, p, len);    }    if (last >= 0) rb_pat_search(pat, str, last, 1);    return str;}

Matches a pattern against self:

Generates a collection of matching results and updates regexp-related global variables:

  • If the pattern contains no groups, each result is a matched substring.

  • If the pattern contains groups, each result is an array containing a matched substring for each group.

With no block given, returns an array of the results:

'cruel world'.scan(/\w+/) # => ["cruel", "world"]
'cruel world'.scan(/.../) # => ["cru", "el ", "wor"]
'cruel world'.scan(/(...)/) # => [["cru"], ["el "], ["wor"]]
'cruel world'.scan(/(..)(..)/) # => [["cr", "ue"], ["l ", "wo"]]
'тест'.scan(/../) # => ["те", "ст"]
'こんにちは'.scan(/../) # => ["こん", "にち"]
'abracadabra'.scan('ab') # => ["ab", "ab"]
'abracadabra'.scan('nosuch') # => []

With a block given, calls the block with each result; returns self:

'cruel world'.scan(/\w+/) {|w| p w }
# => "cruel"
# => "world"
'cruel world'.scan(/(.)(.)/) {|x, y| p [x, y] }
# => ["c", "r"]
# => ["u", "e"]
# => ["l", " "]
# => ["w", "o"]
# => ["r", "l"]
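When the pattern has exactly two groups, each result is a two-element array, so the results can be turned directly into a hash. The sketch below assumes a hypothetical helper named parse_pairs and a simple "key=value" input format.

```ruby
# Hypothetical helper (not part of String): collects every key=value
# pair found in the string into a hash of strings.
def parse_pairs(string)
  string.scan(/(\w+)=(\w+)/).to_h
end

pairs = parse_pairs('a=1 b=2 c=3')
pairs.keys # => ["a", "b", "c"]
pairs['b'] # => "2"
```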

Related: see Converting to Non-String.

Source
static VALUE
str_scrub(int argc, VALUE *argv, VALUE str)
{
    VALUE repl = argc ? (rb_check_arity(argc, 0, 1), argv[0]) : Qnil;
    VALUE new = rb_str_scrub(str, repl);
    return NIL_P(new) ? str_duplicate(rb_cString, str): new;
}

Returns a copy of self with each invalid byte sequence replaced by the given replacement_string.

With no block given, replaces each invalid sequence with the given replacement_string (by default, "�" for a Unicode encoding, '?' otherwise):

"foo\x81\x81bar".scrub # => "foo��bar"
"foo\x81\x81bar".force_encoding('US-ASCII').scrub # => "foo??bar"
"foo\x81\x81bar".scrub('xyzzy') # => "fooxyzzyxyzzybar"

With a block given, calls the block with each invalid sequence, and replaces that sequence with the return value of the block:

"foo\x81\x81bar".scrub {|sequence| p sequence; 'XYZZY' } # => "fooXYZZYXYZZYbar"

Output:

"\x81"
"\x81"

Related: see Converting to New String.
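As an illustration of String#scrub, it can be paired with String#valid_encoding? to sanitize input only when needed. This is a minimal sketch; the helper name sanitize and its default replacement are hypothetical choices, and the example assumes a UTF-8 source encoding.

```ruby
# Hypothetical helper (not part of String): returns the string itself
# when its encoding is valid, otherwise a scrubbed copy.
def sanitize(string, replacement = '?')
  string.valid_encoding? ? string : string.scrub(replacement)
end

broken = "foo\x81bar"
broken.valid_encoding?           # => false
sanitize(broken)                 # => "foo?bar"
sanitize(broken).valid_encoding? # => true
```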

Source
static VALUE
str_scrub_bang(int argc, VALUE *argv, VALUE str)
{
    VALUE repl = argc ? (rb_check_arity(argc, 0, 1), argv[0]) : Qnil;
    VALUE new = rb_str_scrub(str, repl);
    if (!NIL_P(new)) rb_str_replace(str, new);
    return str;
}

Like String#scrub, except that:

  • Any replacements are made in self.

  • Returns self.

Related: see Modifying.

Source
VALUErb_str_setbyte(VALUE str, VALUE index, VALUE value){    long pos = NUM2LONG(index);    long len = RSTRING_LEN(str);    char *ptr, *head, *left = 0;    rb_encoding *enc;    int cr = ENC_CODERANGE_UNKNOWN, width, nlen;    if (pos < -len || len <= pos)        rb_raise(rb_eIndexError, "index %ld out of string", pos);    if (pos < 0)        pos += len;    VALUE v = rb_to_int(value);    VALUE w = rb_int_and(v, INT2FIX(0xff));    char byte = (char)(NUM2INT(w) & 0xFF);    if (!str_independent(str))        str_make_independent(str);    enc = STR_ENC_GET(str);    head = RSTRING_PTR(str);    ptr = &head[pos];    if (!STR_EMBED_P(str)) {        cr = ENC_CODERANGE(str);        switch (cr) {          case ENC_CODERANGE_7BIT:            left = ptr;            *ptr = byte;            if (ISASCII(byte)) goto end;            nlen = rb_enc_precise_mbclen(left, head+len, enc);            if (!MBCLEN_CHARFOUND_P(nlen))                ENC_CODERANGE_SET(str, ENC_CODERANGE_BROKEN);            else                ENC_CODERANGE_SET(str, ENC_CODERANGE_VALID);            goto end;          case ENC_CODERANGE_VALID:            left = rb_enc_left_char_head(head, ptr, head+len, enc);            width = rb_enc_precise_mbclen(left, head+len, enc);            *ptr = byte;            nlen = rb_enc_precise_mbclen(left, head+len, enc);            if (!MBCLEN_CHARFOUND_P(nlen))                ENC_CODERANGE_SET(str, ENC_CODERANGE_BROKEN);            else if (MBCLEN_CHARFOUND_LEN(nlen) != width || ISASCII(byte))                ENC_CODERANGE_CLEAR(str);            goto end;        }    }    ENC_CODERANGE_CLEAR(str);    *ptr = byte;  end:    return value;}

Sets the byte at zero-based offset index to the value of the given integer; returns integer:

s = 'xyzzy'
s.setbyte(2, 129) # => 129
s # => "xy\x81zy"
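Together with String#getbyte, setbyte allows byte-level edits in place. The following is a rough sketch for ASCII letters only; the helper name upcase_byte_at! is hypothetical, and String#upcase remains the right tool for ordinary case conversion.

```ruby
# Hypothetical helper (not part of String): upcases the ASCII lowercase
# letter stored at the given byte index, in place; returns the string.
def upcase_byte_at!(string, index)
  byte = string.getbyte(index)
  # ASCII 'a'..'z' are 97..122; the uppercase forms are 32 lower.
  string.setbyte(index, byte - 32) if byte.between?(97, 122)
  string
end

s = +'hello'
upcase_byte_at!(s, 0) # => "Hello"
```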

Related: see Modifying.

Source
# File lib/shellwords.rb, line 238
def shellescape
  Shellwords.escape(self)
end

Escapes str so that it can be safely used in a Bourne shell command line.

See Shellwords.shellescape for details.

Source
# File lib/shellwords.rb, line 227
def shellsplit
  Shellwords.split(self)
end

Splits str into an array of tokens in the same way the UNIX Bourne shell does.

See Shellwords.shellsplit for details.

Alias for: length
Alias for: []
Source
static VALUErb_str_slice_bang(int argc, VALUE *argv, VALUE str){    VALUE result = Qnil;    VALUE indx;    long beg, len = 1;    char *p;    rb_check_arity(argc, 1, 2);    str_modify_keep_cr(str);    indx = argv[0];    if (RB_TYPE_P(indx, T_REGEXP)) {        if (rb_reg_search(indx, str, 0, 0) < 0) return Qnil;        VALUE match = rb_backref_get();        struct re_registers *regs = RMATCH_REGS(match);        int nth = 0;        if (argc > 1 && (nth = rb_reg_backref_number(match, argv[1])) < 0) {            if ((nth += regs->num_regs) <= 0) return Qnil;        }        else if (nth >= regs->num_regs) return Qnil;        beg = BEG(nth);        len = END(nth) - beg;        goto subseq;    }    else if (argc == 2) {        beg = NUM2LONG(indx);        len = NUM2LONG(argv[1]);        goto num_index;    }    else if (FIXNUM_P(indx)) {        beg = FIX2LONG(indx);        if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;        if (!len) return Qnil;        beg = p - RSTRING_PTR(str);        goto subseq;    }    else if (RB_TYPE_P(indx, T_STRING)) {        beg = rb_str_index(str, indx, 0);        if (beg == -1) return Qnil;        len = RSTRING_LEN(indx);        result = str_duplicate(rb_cString, indx);        goto squash;    }    else {        switch (rb_range_beg_len(indx, &beg, &len, str_strlen(str, NULL), 0)) {          case Qnil:            return Qnil;          case Qfalse:            beg = NUM2LONG(indx);            if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;            if (!len) return Qnil;            beg = p - RSTRING_PTR(str);            goto subseq;          default:            goto num_index;        }    }  num_index:    if (!(p = rb_str_subpos(str, beg, &len))) return Qnil;    beg = p - RSTRING_PTR(str);  subseq:    result = rb_str_new(RSTRING_PTR(str)+beg, len);    rb_enc_cr_str_copy_for_substr(result, str);  squash:    if (len > 0) {        if (beg == 0) {            rb_str_drop_bytes(str, len);        }        else {            char *sptr 
= RSTRING_PTR(str);            long slen = RSTRING_LEN(str);            if (beg + len > slen) /* pathological check */                len = slen - beg;            memmove(sptr + beg,                    sptr + beg + len,                    slen - (beg + len));            slen -= len;            STR_SET_LEN(str, slen);            TERM_FILL(&sptr[slen], TERM_LEN(str));        }    }    return result;}

Like String#[] (and its alias String#slice), except that:

  • Removes the selected substring from self.

  • Returns the removed substring if any modifications were made, nil otherwise.

A few examples:

s = 'hello'
s.slice!('e') # => "e"
s # => "hllo"
s.slice!('e') # => nil
s # => "hllo"
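Because the removed substring is returned, slice! can consume input piece by piece. A minimal sketch with a Regexp argument; the helper name take_leading_digits! is hypothetical.

```ruby
# Hypothetical helper (not part of String): removes a leading run of
# digits from the string and returns it, or nil when there is none.
def take_leading_digits!(string)
  string.slice!(/\A\d+/)
end

s = +'123abc'
take_leading_digits!(s) # => "123"
s                       # => "abc"
take_leading_digits!(s) # => nil
s                       # => "abc"
```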

Related: see Modifying.

Source
static VALUErb_str_split_m(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    VALUE spat;    VALUE limit;    split_type_t split_type;    long beg, end, i = 0, empty_count = -1;    int lim = 0;    VALUE result, tmp;    result = rb_block_given_p() ? Qfalse : Qnil;    if (rb_scan_args(argc, argv, "02", &spat, &limit) == 2) {        lim = NUM2INT(limit);        if (lim <= 0) limit = Qnil;        else if (lim == 1) {            if (RSTRING_LEN(str) == 0)                return result ? rb_ary_new2(0) : str;            tmp = str_duplicate(rb_cString, str);            if (!result) {                rb_yield(tmp);                return str;            }            return rb_ary_new3(1, tmp);        }        i = 1;    }    if (NIL_P(limit) && !lim) empty_count = 0;    enc = STR_ENC_GET(str);    split_type = SPLIT_TYPE_REGEXP;    if (!NIL_P(spat)) {        spat = get_pat_quoted(spat, 0);    }    else if (NIL_P(spat = rb_fs)) {        split_type = SPLIT_TYPE_AWK;    }    else if (!(spat = rb_fs_check(spat))) {        rb_raise(rb_eTypeError, "value of $; must be String or Regexp");    }    else {        rb_category_warn(RB_WARN_CATEGORY_DEPRECATED, "$; is set to non-nil value");    }    if (split_type != SPLIT_TYPE_AWK) {        switch (BUILTIN_TYPE(spat)) {          case T_REGEXP:            rb_reg_options(spat); /* check if uninitialized */            tmp = RREGEXP_SRC(spat);            split_type = literal_split_pattern(tmp, SPLIT_TYPE_REGEXP);            if (split_type == SPLIT_TYPE_AWK) {                spat = tmp;                split_type = SPLIT_TYPE_STRING;            }            break;          case T_STRING:            mustnot_broken(spat);            split_type = literal_split_pattern(spat, SPLIT_TYPE_STRING);            break;          default:            UNREACHABLE_RETURN(Qnil);        }    }#define SPLIT_STR(beg, len) ( \        empty_count = split_string(result, str, beg, len, empty_count), \        str_mod_check(str, str_start, str_len))    beg = 0;    
char *ptr = RSTRING_PTR(str);    char *const str_start = ptr;    const long str_len = RSTRING_LEN(str);    char *const eptr = str_start + str_len;    if (split_type == SPLIT_TYPE_AWK) {        char *bptr = ptr;        int skip = 1;        unsigned int c;        if (result) result = rb_ary_new();        end = beg;        if (is_ascii_string(str)) {            while (ptr < eptr) {                c = (unsigned char)*ptr++;                if (skip) {                    if (ascii_isspace(c)) {                        beg = ptr - bptr;                    }                    else {                        end = ptr - bptr;                        skip = 0;                        if (!NIL_P(limit) && lim <= i) break;                    }                }                else if (ascii_isspace(c)) {                    SPLIT_STR(beg, end-beg);                    skip = 1;                    beg = ptr - bptr;                    if (!NIL_P(limit)) ++i;                }                else {                    end = ptr - bptr;                }            }        }        else {            while (ptr < eptr) {                int n;                c = rb_enc_codepoint_len(ptr, eptr, &n, enc);                ptr += n;                if (skip) {                    if (rb_isspace(c)) {                        beg = ptr - bptr;                    }                    else {                        end = ptr - bptr;                        skip = 0;                        if (!NIL_P(limit) && lim <= i) break;                    }                }                else if (rb_isspace(c)) {                    SPLIT_STR(beg, end-beg);                    skip = 1;                    beg = ptr - bptr;                    if (!NIL_P(limit)) ++i;                }                else {                    end = ptr - bptr;                }            }        }    }    else if (split_type == SPLIT_TYPE_STRING) {        char *substr_start = ptr;        char *sptr = RSTRING_PTR(spat);        long slen 
= RSTRING_LEN(spat);        if (result) result = rb_ary_new();        mustnot_broken(str);        enc = rb_enc_check(str, spat);        while (ptr < eptr &&               (end = rb_memsearch(sptr, slen, ptr, eptr - ptr, enc)) >= 0) {            /* Check we are at the start of a char */            char *t = rb_enc_right_char_head(ptr, ptr + end, eptr, enc);            if (t != ptr + end) {                ptr = t;                continue;            }            SPLIT_STR(substr_start - str_start, (ptr+end) - substr_start);            str_mod_check(spat, sptr, slen);            ptr += end + slen;            substr_start = ptr;            if (!NIL_P(limit) && lim <= ++i) break;        }        beg = ptr - str_start;    }    else if (split_type == SPLIT_TYPE_CHARS) {        int n;        if (result) result = rb_ary_new_capa(RSTRING_LEN(str));        mustnot_broken(str);        enc = rb_enc_get(str);        while (ptr < eptr &&               (n = rb_enc_precise_mbclen(ptr, eptr, enc)) > 0) {            SPLIT_STR(ptr - str_start, n);            ptr += n;            if (!NIL_P(limit) && lim <= ++i) break;        }        beg = ptr - str_start;    }    else {        if (result) result = rb_ary_new();        long len = RSTRING_LEN(str);        long start = beg;        long idx;        int last_null = 0;        struct re_registers *regs;        VALUE match = 0;        for (; rb_reg_search(spat, str, start, 0) >= 0;             (match ? 
(rb_match_unbusy(match), rb_backref_set(match)) : (void)0)) {            match = rb_backref_get();            if (!result) rb_match_busy(match);            regs = RMATCH_REGS(match);            end = BEG(0);            if (start == end && BEG(0) == END(0)) {                if (!ptr) {                    SPLIT_STR(0, 0);                    break;                }                else if (last_null == 1) {                    SPLIT_STR(beg, rb_enc_fast_mbclen(ptr+beg, eptr, enc));                    beg = start;                }                else {                    if (start == len)                        start++;                    else                        start += rb_enc_fast_mbclen(ptr+start,eptr,enc);                    last_null = 1;                    continue;                }            }            else {                SPLIT_STR(beg, end-beg);                beg = start = END(0);            }            last_null = 0;            for (idx=1; idx < regs->num_regs; idx++) {                if (BEG(idx) == -1) continue;                SPLIT_STR(BEG(idx), END(idx)-BEG(idx));            }            if (!NIL_P(limit) && lim <= ++i) break;        }        if (match) rb_match_unbusy(match);    }    if (RSTRING_LEN(str) > 0 && (!NIL_P(limit) || RSTRING_LEN(str) > beg || lim < 0)) {        SPLIT_STR(beg, RSTRING_LEN(str)-beg);    }    return result ? result : str;}

Creates an array of substrings by splitting self at each occurrence of the given field separator field_sep.

With no arguments given, splits using the field separator $;, whose default value is nil.

With no block given, returns the array of substrings:

'abracadabra'.split('a') # => ["", "br", "c", "d", "br"]

When field_sep is nil or ' ' (a single space), splits at each sequence of whitespace:

'foo bar baz'.split(nil) # => ["foo", "bar", "baz"]
'foo bar baz'.split(' ') # => ["foo", "bar", "baz"]
"foo \n\tbar\t\n  baz".split(' ') # => ["foo", "bar", "baz"]
'foo  bar   baz'.split(' ') # => ["foo", "bar", "baz"]
''.split(' ') # => []
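
The whitespace mode differs from splitting on a Regexp that matches a single space. A minimal sketch of the contrast (the variable names are illustrative only):

```ruby
line = "  foo  bar baz "

# A ' ' separator skips leading whitespace and treats each run as one separator.
awk_style = line.split(' ')

# A Regexp that matches one space splits at every space, so empty fields appear.
per_space = line.split(/ /)

p awk_style # => ["foo", "bar", "baz"]
p per_space # => ["", "", "foo", "", "bar", "baz"]
```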

When field_sep is an empty string, splits at every character:

'abracadabra'.split('') # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]
''.split('') # => []
'тест'.split('') # => ["т", "е", "с", "т"]
'こんにちは'.split('') # => ["こ", "ん", "に", "ち", "は"]

When field_sep is a non-empty string and different from ' ' (a single space), uses that string as the separator:

'abracadabra'.split('a') # => ["", "br", "c", "d", "br"]
'abracadabra'.split('ab') # => ["", "racad", "ra"]
''.split('a') # => []
'тест'.split('т') # => ["", "ес"]
'こんにちは'.split('に') # => ["こん", "ちは"]

When field_sep is a Regexp, splits at each occurrence of a matching substring:

'abracadabra'.split(/ab/) # => ["", "racad", "ra"]
'1 + 1 == 2'.split(/\W+/) # => ["1", "1", "2"]
'abracadabra'.split(//) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]

If the Regexp contains groups, their matches are included in the returned array:

'1:2:3'.split(/(:)()()/, 2) # => ["1", ":", "", "", "2:3"]
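
One practical use of groups is to keep the separators in the result. A small sketch, assuming a simple arithmetic string (the names expr, tokens and operands are hypothetical):

```ruby
expr = '10+20-3'

# With a capturing group, each matched operator is kept as its own element.
tokens = expr.split(/([+-])/)

# Without a group, the operators are discarded.
operands = expr.split(/[+-]/)

p tokens   # => ["10", "+", "20", "-", "3"]
p operands # => ["10", "20", "3"]
```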

Argument limit sets a limit on the size of the returned array; it also determines whether trailing empty strings are included in the returned array.

When limit is zero, there is no limit on the size of the array, but trailing empty strings are omitted:

'abracadabra'.split('', 0) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a"]
'abracadabra'.split('a', 0) # => ["", "br", "c", "d", "br"]  # Empty string after last 'a' omitted.

When limit is a positive integer, there is a limit on the size of the array (no more than limit - 1 splits occur), and trailing empty strings are included:

'abracadabra'.split('', 3) # => ["a", "b", "racadabra"]
'abracadabra'.split('a', 3) # => ["", "br", "cadabra"]
'abracadabra'.split('', 30) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""]
'abracadabra'.split('a', 30) # => ["", "br", "c", "d", "br", ""]
'abracadabra'.split('', 1) # => ["abracadabra"]
'abracadabra'.split('a', 1) # => ["abracadabra"]

When limit is negative, there is no limit on the size of the array, and trailing empty strings are not omitted:

'abracadabra'.split('', -1) # => ["a", "b", "r", "a", "c", "a", "d", "a", "b", "r", "a", ""]
'abracadabra'.split('a', -1) # => ["", "br", "c", "d", "br", ""]
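
The effect of limit on trailing empty strings matters when parsing delimited records. A hedged sketch, assuming a comma-separated record whose last two fields are empty (the record value is made up for illustration):

```ruby
record = 'a,b,,c,,'

default_fields = record.split(',')     # trailing empty fields dropped
all_fields     = record.split(',', -1) # trailing empty fields kept
head_and_rest  = record.split(',', 2)  # at most one split

p default_fields # => ["a", "b", "", "c"]
p all_fields     # => ["a", "b", "", "c", "", ""]
p head_and_rest  # => ["a", "b,,c,,"]
```

Passing -1 is the usual choice when the number of fields must match the number of separators plus one.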

If a block is given, the block is called with each substring, and the method returns self:

'foo bar baz'.split(' ') {|substring| p substring }

Output:

"foo"
"bar"
"baz"

Note that the above example is functionally equivalent to:

'foo bar baz'.split(' ').each {|substring| p substring }

Output:

"foo"
"bar"
"baz"

But the latter:

  • Has poorer performance because it creates an intermediate array.

  • Returns an array (instead of self).
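
The difference in return values can be checked directly. A minimal sketch (text, lengths and the two result variables are illustrative names):

```ruby
text = 'foo bar baz'
lengths = []

# Block form: each substring is yielded; the receiver itself is returned.
block_result = text.split(' ') { |word| lengths << word.length }

# Array form: an intermediate array is built and returned by #each.
array_result = text.split(' ').each { |word| word.length }

p lengths                    # => [3, 3, 3]
p block_result.equal?(text)  # => true
p array_result               # => ["foo", "bar", "baz"]
```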

Related: see Converting to Non-String.

Source
static VALUErb_str_squeeze(int argc, VALUE *argv, VALUE str){    str = str_duplicate(rb_cString, str);    rb_str_squeeze_bang(argc, argv, str);    return str;}

Returns a copy of self with each tuple (doubling, tripling, etc.) of specified characters “squeezed” down to a single character.

The tuples to be squeezed are specified by arguments selectors, each of which is a string; see Character Selectors.

A single argument may be a single character:

'Noooooo!'.squeeze('o') # => "No!"
'foo  bar  baz'.squeeze(' ') # => "foo bar baz"
'Mississippi'.squeeze('s') # => "Misisippi"
'Mississippi'.squeeze('p') # => "Mississipi"
'Mississippi'.squeeze('x') # => "Mississippi"  # Unused selector character is ignored.
'бессонница'.squeeze('с') # => "бесонница"
'бессонница'.squeeze('н') # => "бессоница"

A single argument may be a string of characters:

'Mississippi'.squeeze('sp') # => "Misisipi"
'Mississippi'.squeeze('ps') # => "Misisipi"   # Order doesn't matter.
'Mississippi'.squeeze('nonsense') # => "Misisippi"  # Unused selector characters are ignored.

A single argument may be a range of characters:

'Mississippi'.squeeze('a-p') # => "Mississipi"
'Mississippi'.squeeze('q-z') # => "Misisippi"
'Mississippi'.squeeze('a-z') # => "Misisipi"

Multiple arguments are allowed; see Multiple Character Selectors.
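
As a sketch of the two cases not shown above: with no arguments every run of repeated characters is squeezed, and with several selectors only characters matched by all of them (their intersection) are squeezed. The variable names are illustrative:

```ruby
word = 'Mississippi'

# No selectors: every repeated run is squeezed.
all_runs = word.squeeze

# Two selectors: lowercase letters, intersected with "anything except s".
not_s = word.squeeze('a-z', '^s')

p all_runs # => "Misisipi"
p not_s    # => "Mississipi"
```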

Related: see Converting to New String.

Source
static VALUErb_str_squeeze_bang(int argc, VALUE *argv, VALUE str){    char squeez[TR_TABLE_SIZE];    rb_encoding *enc = 0;    VALUE del = 0, nodel = 0;    unsigned char *s, *send, *t;    int i, modify = 0;    int ascompat, singlebyte = single_byte_optimizable(str);    unsigned int save;    if (argc == 0) {        enc = STR_ENC_GET(str);    }    else {        for (i=0; i<argc; i++) {            VALUE s = argv[i];            StringValue(s);            enc = rb_enc_check(str, s);            if (singlebyte && !single_byte_optimizable(s))                singlebyte = 0;            tr_setup_table(s, squeez, i==0, &del, &nodel, enc);        }    }    str_modify_keep_cr(str);    s = t = (unsigned char *)RSTRING_PTR(str);    if (!s || RSTRING_LEN(str) == 0) return Qnil;    send = (unsigned char *)RSTRING_END(str);    save = -1;    ascompat = rb_enc_asciicompat(enc);    if (singlebyte) {        while (s < send) {            unsigned int c = *s++;            if (c != save || (argc > 0 && !squeez[c])) {                *t++ = save = c;            }        }    }    else {        while (s < send) {            unsigned int c;            int clen;            if (ascompat && (c = *s) < 0x80) {                if (c != save || (argc > 0 && !squeez[c])) {                    *t++ = save = c;                }                s++;            }            else {                c = rb_enc_codepoint_len((char *)s, (char *)send, &clen, enc);                if (c != save || (argc > 0 && !tr_find(c, squeez, del, nodel))) {                    if (t != s) rb_enc_mbcput(c, t, enc);                    save = c;                    t += clen;                }                s += clen;            }        }    }    TERM_FILL((char *)t, TERM_LEN(str));    if ((char *)t - RSTRING_PTR(str) != RSTRING_LEN(str)) {        STR_SET_LEN(str, (char *)t - RSTRING_PTR(str));        modify = 1;    }    if (modify) return str;    return Qnil;}

Like String#squeeze, except that:

  • Characters are squeezed in self (not in a copy of self).

  • Returns self if any changes are made, nil otherwise.
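
Because nil is returned when nothing changes, chaining a call onto the result of a bang method is fragile. A short sketch of the behavior (variable names are hypothetical; the unary + yields a mutable string):

```ruby
s = +'Mississippi'

first  = s.squeeze!('s') # a change is made, so self is returned
second = s.squeeze!('s') # nothing left to squeeze, so nil is returned

p first.equal?(s) # => true
p s               # => "Misisippi"
p second          # => nil
```

When a string result is always needed, the non-bang String#squeeze is the safer choice.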

Related: see Modifying.

Source
static VALUErb_str_start_with(int argc, VALUE *argv, VALUE str){    int i;    for (i=0; i<argc; i++) {        VALUE tmp = argv[i];        if (RB_TYPE_P(tmp, T_REGEXP)) {            if (rb_reg_start_with_p(tmp, str))                return Qtrue;        }        else {            const char *p, *s, *e;            long slen, tlen;            rb_encoding *enc;            StringValue(tmp);            enc = rb_enc_check(str, tmp);            if ((tlen = RSTRING_LEN(tmp)) == 0) return Qtrue;            if ((slen = RSTRING_LEN(str)) < tlen) continue;            p = RSTRING_PTR(str);            e = p + slen;            s = p + tlen;            if (!at_char_right_boundary(p, s, e, enc))                continue;            if (memcmp(p, RSTRING_PTR(tmp), tlen) == 0)                return Qtrue;        }    }    return Qfalse;}

Returns whether self starts with any of the given patterns.

For each argument, the pattern used is:

  • The argument itself, if it is a Regexp.

  • The argument converted to a string and compared literally, otherwise.

Returns true if any pattern matches the beginning, false otherwise:

'hello'.start_with?('hell') # => true
'hello'.start_with?(/H/i) # => true
'hello'.start_with?('heaven', 'hell') # => true
'hello'.start_with?('heaven', 'paradise') # => false
'тест'.start_with?('т') # => true
'こんにちは'.start_with?('こ') # => true
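
The distinction between string and Regexp arguments shows up with characters that are special in a Regexp. A minimal sketch, assuming a made-up path value:

```ruby
path = '/usr/local/bin'

any_prefix   = path.start_with?('/opt', '/usr') # any one match is enough
literal_dot  = path.start_with?('.')            # a string is compared literally
regexp_dot   = path.start_with?(/./)            # a Regexp dot matches any character
empty_prefix = path.start_with?('')             # the empty string always matches

p any_prefix   # => true
p literal_dot  # => false
p regexp_dot   # => true
p empty_prefix # => true
```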

Related: see Querying.

Source
static VALUErb_str_strip(VALUE str){    char *start;    long olen, loffset, roffset;    rb_encoding *enc = STR_ENC_GET(str);    RSTRING_GETMEM(str, start, olen);    loffset = lstrip_offset(str, start, start+olen, enc);    roffset = rstrip_offset(str, start+loffset, start+olen, enc);    if (loffset <= 0 && roffset <= 0) return str_duplicate(rb_cString, str);    return rb_str_subseq(str, loffset, olen-loffset-roffset);}

Returns a copy of self with leading and trailing whitespace removed; see Whitespace in Strings:

whitespace = "\x00\t\n\v\f\r "
s = whitespace + 'abc' + whitespace # => "\u0000\t\n\v\f\r abc\u0000\t\n\v\f\r "
s.strip # => "abc"

Related: see Converting to New String.

Source
static VALUErb_str_strip_bang(VALUE str){    char *start;    long olen, loffset, roffset;    rb_encoding *enc;    str_modify_keep_cr(str);    enc = STR_ENC_GET(str);    RSTRING_GETMEM(str, start, olen);    loffset = lstrip_offset(str, start, start+olen, enc);    roffset = rstrip_offset(str, start+loffset, start+olen, enc);    if (loffset > 0 || roffset > 0) {        long len = olen-roffset;        if (loffset > 0) {            len -= loffset;            memmove(start, start + loffset, len);        }        STR_SET_LEN(str, len);        TERM_FILL(start+len, rb_enc_mbminlen(enc));        return str;    }    return Qnil;}

Like String#strip, except that:

  • Any modifications are made to self.

  • Returns self if any modifications are made, nil otherwise.

Related: see Modifying.

Source
static VALUErb_str_sub(int argc, VALUE *argv, VALUE str){    str = str_duplicate(rb_cString, str);    rb_str_sub_bang(argc, argv, str);    return str;}

Returns a copy of self, possibly with a substring replaced.

Argument pattern may be a string or a Regexp; argument replacement may be a string or a Hash.

Varying types for the argument values make this method very versatile.

Below are some simple examples; for many more examples, see Substitution Methods.

With arguments pattern and string replacement given, replaces the first matching substring with the given replacement string:

s = 'abracadabra' # => "abracadabra"
s.sub('bra', 'xyzzy') # => "axyzzycadabra"
s.sub(/bra/, 'xyzzy') # => "axyzzycadabra"
s.sub('nope', 'xyzzy') # => "abracadabra"

With arguments pattern and hash replacement given, replaces the first matching substring with a value from the given replacement hash, or removes it:

h = {'a' => 'A', 'b' => 'B', 'c' => 'C'}
s.sub('b', h) # => "aBracadabra"
s.sub(/b/, h) # => "aBracadabra"
s.sub(/d/, h) # => "abracaabra"  # 'd' removed.

With argument pattern and a block given, calls the block with the first matching substring; replaces that substring with the block’s return value:

s.sub('b') {|match| match.upcase } # => "aBracadabra"
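
A string replacement may also refer to the pattern's captures, as described under Substitution Methods. A brief sketch with made-up input values, showing numbered and named back-references and the first-match-only behavior:

```ruby
name = 'John Smith'
# \1 and \2 refer to the first and second capture groups.
swapped = name.sub(/(\w+) (\w+)/, '\2, \1')

date = '2024-01-15'
# \k<name> refers to a named capture group.
reordered = date.sub(/(?<y>\d+)-(?<m>\d+)-(?<d>\d+)/, '\k<d>/\k<m>/\k<y>')

# Only the first match is replaced, even when more matches exist.
first_only = 'abracadabra'.sub(/a/) { |match| match.upcase }

p swapped    # => "Smith, John"
p reordered  # => "15/01/2024"
p first_only # => "Abracadabra"
```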

Related: see Converting to New String.

Source
static VALUErb_str_sub_bang(int argc, VALUE *argv, VALUE str){    VALUE pat, repl, hash = Qnil;    int iter = 0;    long plen;    int min_arity = rb_block_given_p() ? 1 : 2;    long beg;    rb_check_arity(argc, min_arity, 2);    if (argc == 1) {        iter = 1;    }    else {        repl = argv[1];        hash = rb_check_hash_type(argv[1]);        if (NIL_P(hash)) {            StringValue(repl);        }    }    pat = get_pat_quoted(argv[0], 1);    str_modifiable(str);    beg = rb_pat_search(pat, str, 0, 1);    if (beg >= 0) {        rb_encoding *enc;        int cr = ENC_CODERANGE(str);        long beg0, end0;        VALUE match, match0 = Qnil;        struct re_registers *regs;        char *p, *rp;        long len, rlen;        match = rb_backref_get();        regs = RMATCH_REGS(match);        if (RB_TYPE_P(pat, T_STRING)) {            beg0 = beg;            end0 = beg0 + RSTRING_LEN(pat);            match0 = pat;        }        else {            beg0 = BEG(0);            end0 = END(0);            if (iter) match0 = rb_reg_nth_match(0, match);        }        if (iter || !NIL_P(hash)) {            p = RSTRING_PTR(str); len = RSTRING_LEN(str);            if (iter) {                repl = rb_obj_as_string(rb_yield(match0));            }            else {                repl = rb_hash_aref(hash, rb_str_subseq(str, beg0, end0 - beg0));                repl = rb_obj_as_string(repl);            }            str_mod_check(str, p, len);            rb_check_frozen(str);        }        else {            repl = rb_reg_regsub(repl, str, regs, RB_TYPE_P(pat, T_STRING) ? 
Qnil : pat);        }        enc = rb_enc_compatible(str, repl);        if (!enc) {            rb_encoding *str_enc = STR_ENC_GET(str);            p = RSTRING_PTR(str); len = RSTRING_LEN(str);            if (coderange_scan(p, beg0, str_enc) != ENC_CODERANGE_7BIT ||                coderange_scan(p+end0, len-end0, str_enc) != ENC_CODERANGE_7BIT) {                rb_raise(rb_eEncCompatError, "incompatible character encodings: %s and %s",                         rb_enc_inspect_name(str_enc),                         rb_enc_inspect_name(STR_ENC_GET(repl)));            }            enc = STR_ENC_GET(repl);        }        rb_str_modify(str);        rb_enc_associate(str, enc);        if (ENC_CODERANGE_UNKNOWN < cr && cr < ENC_CODERANGE_BROKEN) {            int cr2 = ENC_CODERANGE(repl);            if (cr2 == ENC_CODERANGE_BROKEN ||                (cr == ENC_CODERANGE_VALID && cr2 == ENC_CODERANGE_7BIT))                cr = ENC_CODERANGE_UNKNOWN;            else                cr = cr2;        }        plen = end0 - beg0;        rlen = RSTRING_LEN(repl);        len = RSTRING_LEN(str);        if (rlen > plen) {            RESIZE_CAPA(str, len + rlen - plen);        }        p = RSTRING_PTR(str);        if (rlen != plen) {            memmove(p + beg0 + rlen, p + beg0 + plen, len - beg0 - plen);        }        rp = RSTRING_PTR(repl);        memmove(p + beg0, rp, rlen);        len += rlen - plen;        STR_SET_LEN(str, len);        TERM_FILL(&RSTRING_PTR(str)[len], TERM_LEN(str));        ENC_CODERANGE_SET(str, cr);        RB_GC_GUARD(match);        return str;    }    return Qnil;}

Like String#sub, except that:

  • Changes are made to self, not to a copy of self.

  • Returns self if any changes are made, nil otherwise.

Related: see Modifying.

Source
VALUErb_str_succ(VALUE orig){    VALUE str;    str = rb_str_new(RSTRING_PTR(orig), RSTRING_LEN(orig));    rb_enc_cr_str_copy_for_substr(str, orig);    return str_succ(str);}

Returns the successor to self. The successor is calculated by incrementing characters.

The first character to be incremented is the rightmost alphanumeric or, if there are no alphanumerics, the rightmost character:

'THX1138'.succ # => "THX1139"
'<<koala>>'.succ # => "<<koalb>>"
'***'.succ # => "**+"
'тест'.succ # => "тесу"
'こんにちは'.succ # => "こんにちば"

The successor to a digit is another digit, “carrying” to the next-left character for a “rollover” from 9 to 0, and prepending another digit if necessary:

'00'.succ # => "01"
'09'.succ # => "10"
'99'.succ # => "100"

The successor to a letter is another letter of the same case, carrying to the next-left character for a rollover, and prepending another same-case letter if necessary:

'aa'.succ # => "ab"
'az'.succ # => "ba"
'zz'.succ # => "aaa"
'AA'.succ # => "AB"
'AZ'.succ # => "BA"
'ZZ'.succ # => "AAA"

The successor to a non-alphanumeric character is the next character in the underlying character set’s collating sequence, carrying to the next-left character for a rollover, and prepending another character if necessary:

s = 0.chr * 3 # => "\x00\x00\x00"
s.succ # => "\x00\x00\x01"
s = 255.chr * 3 # => "\xFF\xFF\xFF"
s.succ # => "\x01\x00\x00\x00"

Carrying can occur between and among mixtures of alphanumeric characters:

s = 'zz99zz99' # => "zz99zz99"
s.succ # => "aaa00aa00"
s = '99zz99zz' # => "99zz99zz"
s.succ # => "100aa00aa"
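
The carrying rules make succ convenient for generating sequential labels. A small sketch, assuming a hypothetical starting label:

```ruby
labels = ['x8']

# Each step increments the rightmost alphanumeric; 9 rolls over and carries into the letter.
3.times { labels << labels.last.succ }

p labels # => ["x8", "x9", "y0", "y1"]
```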

The successor to an empty String is a new empty String:

''.succ # => ""

Related: see Converting to New String.

Also aliased as: next
Source
static VALUErb_str_succ_bang(VALUE str){    rb_str_modify(str);    str_succ(str);    return str;}

Like String#succ, but modifies self in place; returns self.

Related: see Modifying.

Also aliased as: next!
Source
static VALUErb_str_sum(int argc, VALUE *argv, VALUE str){    int bits = 16;    char *ptr, *p, *pend;    long len;    VALUE sum = INT2FIX(0);    unsigned long sum0 = 0;    if (rb_check_arity(argc, 0, 1) && (bits = NUM2INT(argv[0])) < 0) {        bits = 0;    }    ptr = p = RSTRING_PTR(str);    len = RSTRING_LEN(str);    pend = p + len;    while (p < pend) {        if (FIXNUM_MAX - UCHAR_MAX < sum0) {            sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));            str_mod_check(str, ptr, len);            sum0 = 0;        }        sum0 += (unsigned char)*p;        p++;    }    if (bits == 0) {        if (sum0) {            sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));        }    }    else {        if (sum == INT2FIX(0)) {            if (bits < (int)sizeof(long)*CHAR_BIT) {                sum0 &= (((unsigned long)1)<<bits)-1;            }            sum = LONG2FIX(sum0);        }        else {            VALUE mod;            if (sum0) {                sum = rb_funcall(sum, '+', 1, LONG2FIX(sum0));            }            mod = rb_funcall(INT2FIX(1), idLTLT, 1, INT2FIX(bits));            mod = rb_funcall(mod, '-', 1, INT2FIX(1));            sum = rb_funcall(sum, '&', 1, mod);        }    }    return sum;}

Returns a basic n-bit checksum of the characters in self; the checksum is the sum of the binary value of each byte in self, modulo 2**n (that is, the sum masked with 2**n - 1). The default value of n is 16:

'hello'.sum # => 532
'hello'.sum(4) # => 4
'hello'.sum(64) # => 532
'тест'.sum # => 1405
'こんにちは'.sum # => 2582
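
The checksum can be reproduced from the byte values. A hedged sketch of the arithmetic (the variable names are illustrative; the zero-bit case follows the C source, which skips the masking step when the bit count is 0):

```ruby
text = 'hello'
byte_total = text.bytes.sum # 104 + 101 + 108 + 108 + 111

default_sum = text.sum      # 16 bits by default
four_bit    = text.sum(4)   # same as byte_total % 2**4
unmasked    = text.sum(0)   # zero bits: the full sum is returned

long_text = 'z' * 1000      # byte total 122_000 exceeds 16 bits
wrapped   = long_text.sum

p byte_total  # => 532
p default_sum # => 532
p four_bit    # => 4
p unmasked    # => 532
p wrapped == 122_000 % 2**16 # => true
```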

This is not a particularly strong checksum.

Related: see Querying.

Source
static VALUErb_str_swapcase(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_DOWNCASE;    VALUE ret;    flags = check_case_options(argc, argv, flags);    enc = str_true_enc(str);    if (RSTRING_LEN(str) == 0 || !RSTRING_PTR(str)) return str_duplicate(rb_cString, str);    if (flags&ONIGENC_CASE_ASCII_ONLY) {        ret = rb_str_new(0, RSTRING_LEN(str));        rb_str_ascii_casemap(str, ret, &flags, enc);    }    else {        ret = rb_str_casemap(str, &flags, enc);    }    return ret;}

Returns a string containing the characters in self, with cases reversed:

  • Each uppercase character is downcased.

  • Each lowercase character is upcased.

Examples:

'Hello World!'.swapcase # => "hELLO wORLD!"
'тест'.swapcase # => "ТЕСТ"

Some characters (and even character sets) do not have casing:

'12345'.swapcase # => "12345"
'こんにちは'.swapcase # => "こんにちは"

The casing may be affected by the given mapping; see Case Mapping.
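
As an illustration of a mapping argument, the :ascii option restricts case changes to ASCII letters. A minimal sketch with a made-up mixed-script string:

```ruby
mixed = 'Hello Тест'

full_unicode = mixed.swapcase         # all cased characters are swapped
ascii_only   = mixed.swapcase(:ascii) # only A-Z and a-z are swapped

p full_unicode # => "hELLO тЕСТ"
p ascii_only   # => "hELLO Тест"
```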

Related: see Converting to New String.

Source
static VALUErb_str_swapcase_bang(int argc, VALUE *argv, VALUE str){    rb_encoding *enc;    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE | ONIGENC_CASE_DOWNCASE;    flags = check_case_options(argc, argv, flags);    str_modify_keep_cr(str);    enc = str_true_enc(str);    if (flags&ONIGENC_CASE_ASCII_ONLY)        rb_str_ascii_casemap(str, str, &flags, enc);    else        str_shared_replace(str, rb_str_casemap(str, &flags, enc));    if (ONIGENC_CASE_MODIFIED&flags) return str;    return Qnil;}

Like String#swapcase, except that:

  • Changes are made to self, not to a copy of self.

  • Returns self if any changes are made, nil otherwise.

Related: see Modifying.

Source
static VALUEstring_to_c(VALUE self){    VALUE num;    rb_must_asciicompat(self);    (void)parse_comp(rb_str_fill_terminator(self, 1), FALSE, &num);    return num;}

Returns self interpreted as a Complex object; leading whitespace and trailing garbage are ignored:

'9'.to_c # => (9+0i)
'2.5'.to_c # => (2.5+0i)
'2.5/1'.to_c # => ((5/2)+0i)
'-3/2'.to_c # => ((-3/2)+0i)
'-i'.to_c # => (0-1i)
'45i'.to_c # => (0+45i)
'3-4i'.to_c # => (3-4i)
'-4e2-4e-2i'.to_c # => (-400.0-0.04i)
'-0.0-0.0i'.to_c # => (-0.0-0.0i)
'1/2+3/4i'.to_c # => ((1/2)+(3/4)*i)
'1.0@0'.to_c # => (1+0.0i)
"1.0@#{Math::PI/2}".to_c # => (0.0+1i)
"1.0@#{Math::PI}".to_c # => (-1+0.0i)

Returns Complex zero if the string cannot be converted:

'ruby'.to_c # => (0+0i)

See Kernel#Complex.

Source
static VALUErb_str_to_f(VALUE str){    return DBL2NUM(rb_str_to_dbl(str, FALSE));}

Returns the result of interpreting leading characters in self as a Float:

'3.14159'.to_f # => 3.14159
'1.234e-2'.to_f # => 0.01234

Characters past a leading valid number are ignored:

'3.14 (pi to two places)'.to_f # => 3.14

Returns zero if there is no leading valid number:

'abcdef'.to_f # => 0.0

See Converting to Non-String.

Source
static VALUErb_str_to_i(int argc, VALUE *argv, VALUE str){    int base = 10;    if (rb_check_arity(argc, 0, 1) && (base = NUM2INT(argv[0])) < 0) {        rb_raise(rb_eArgError, "invalid radix %d", base);    }    return rb_str_to_inum(str, base, FALSE);}

Returns the result of interpreting leading characters in self as an integer in the given base (which must be in (0, 2..36)):

'123456'.to_i # => 123456
'123def'.to_i(16) # => 1195503

With base zero, self may contain leading characters that specify the actual base:

'123def'.to_i(0) # => 123
'0123def'.to_i(0) # => 83
'0b123def'.to_i(0) # => 1
'0o123def'.to_i(0) # => 83
'0d123def'.to_i(0) # => 123
'0x123def'.to_i(0) # => 1195503

Characters past a leading valid number (in the given base) are ignored:

'12.345'.to_i # => 12
'12345'.to_i(2) # => 1

Returns zero if there is no leading valid number:

'abcdef'.to_i # => 0
'2'.to_i(2) # => 0
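
Because invalid input yields a partial result or zero rather than an error, to_i cannot distinguish '0' from unparseable text. A short sketch contrasting it with the stricter Kernel#Integer (the input strings are made up):

```ruby
lenient = '12abc'.to_i # trailing characters are ignored

# Kernel#Integer rejects the same input with ArgumentError.
strict = begin
  Integer('12abc')
rescue ArgumentError
  nil
end

no_number = 'abc'.to_i # no leading valid number

p lenient   # => 12
p strict    # => nil
p no_number # => 0
```
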
Source
# File ext/json/lib/json/add/string.rb, line 32
def to_json_raw(...)
  to_json_raw_object.to_json(...)
end

This method creates a JSON text from the result of a call to to_json_raw_object of this String.

Source
# File ext/json/lib/json/add/string.rb, line 21
def to_json_raw_object
  {
    JSON.create_id => self.class.name,
    "raw" => unpack("C*"),
  }
end

This method creates a raw object hash that can be nested into other data structures and will be generated as a raw string. Use this method if you want to convert raw strings (e.g. binary data) to JSON instead of UTF-8 strings.

Source
static VALUEstring_to_r(VALUE self){    VALUE num;    rb_must_asciicompat(self);    num = parse_rat(RSTRING_PTR(self), RSTRING_END(self), 0, TRUE);    if (RB_FLOAT_TYPE_P(num) && !FLOAT_ZERO_P(num))        rb_raise(rb_eFloatDomainError, "Infinity");    return num;}

Returns the result of interpreting leading characters in str as a rational. Leading whitespace and extraneous characters past the end of a valid number are ignored. Digit sequences can be separated by an underscore. If there is not a valid number at the start of str, zero is returned. This method never raises an exception.

'  2  '.to_r #=> (2/1)
'300/2'.to_r #=> (150/1)
'-9.2'.to_r #=> (-46/5)
'-9.2e2'.to_r #=> (-920/1)
'1_234_567'.to_r #=> (1234567/1)
'21 June 09'.to_r #=> (21/1)
'21/06/09'.to_r #=> (7/2)
'BWV 1079'.to_r #=> (0/1)

NOTE: "0.3".to_r isn’t the same as 0.3.to_r. The former is equivalent to "3/10".to_r, but the latter isn’t.

"0.3".to_r == 3/10r #=> true
0.3.to_r == 3/10r #=> false

See also Kernel#Rational.

Source
static VALUErb_str_to_s(VALUE str){    if (rb_obj_class(str) != rb_cString) {        return str_duplicate(rb_cString, str);    }    return str;}

Returns self if self is a String, or self converted to a String if self is an instance of a subclass of String.

Also aliased as: to_str
Alias for: to_s
Alias for: intern
Source
static VALUErb_str_tr(VALUE str, VALUE src, VALUE repl){    str = str_duplicate(rb_cString, str);    tr_trans(str, src, repl, 0);    return str;}

Returns a copy of self with each character specified by string selector translated to the corresponding character in string replacements. The correspondence is positional:

  • Each occurrence of the first character specified by selector is translated to the first character in replacements.

  • Each occurrence of the second character specified by selector is translated to the second character in replacements.

  • And so on.

Example:

'hello'.tr('el', 'ip') #=> "hippo"

If replacements is shorter than selector, it is implicitly padded with its own last character:

'hello'.tr('aeiou', '-') # => "h-ll-"
'hello'.tr('aeiou', 'AA-') # => "hAll-"

Arguments selector and replacements must be valid character selectors (see Character Selectors), and may use any of the valid selector forms, including negation, ranges, and escaping:

# Negation.
'hello'.tr('^aeiou', '-') # => "-e--o"
# Ranges.
'ibm'.tr('b-z', 'a-z') # => "hal"
# Escapes.
'hel^lo'.tr('\^aeiou', '-') # => "h-l-l-"    # Escaped leading caret.
'i-b-m'.tr('b\-z', 'a-z') # => "ibabm"     # Escaped embedded hyphen.
'foo\\bar'.tr('ab\\', 'XYZ') # => "fooZYXr"   # Escaped backslash.
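
Positional translation with ranges is enough to express a simple substitution cipher. A sketch of ROT13, where rot13 is a hypothetical helper name and not part of String:

```ruby
# Hypothetical helper: rotates each ASCII letter by 13 places.
def rot13(text)
  # 'A-Z' maps onto 'N-ZA-M' and 'a-z' onto 'n-za-m', position by position.
  text.tr('A-Za-z', 'N-ZA-Mn-za-m')
end

encoded = rot13('Hello, World!')
decoded = rot13(encoded)

p encoded # => "Uryyb, Jbeyq!"
p decoded # => "Hello, World!"
```
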
Source
static VALUErb_str_tr_bang(VALUE str, VALUE src, VALUE repl){    return tr_trans(str, src, repl, 0);}

Like String#tr, but modifies self in place. Returns self if any changes were made, nil otherwise.

Source
static VALUErb_str_tr_s(VALUE str, VALUE src, VALUE repl){    str = str_duplicate(rb_cString, str);    tr_trans(str, src, repl, 1);    return str;}

Like String#tr, but also squeezes the modified portions of the translated string; returns a new string (translated and squeezed).

'hello'.tr_s('l', 'r') #=> "hero"
'hello'.tr_s('el', '-') #=> "h-o"
'hello'.tr_s('el', 'hx') #=> "hhxo"
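
A common use of the squeezing behavior is collapsing mixed whitespace into single spaces. A minimal sketch comparing String#tr and String#tr_s on a made-up input:

```ruby
messy = "a \t\n b   c"

# tr translates each whitespace character to a space, one for one.
translated = messy.tr(" \t\n", ' ')

# tr_s also collapses each run of translated characters into one.
collapsed = messy.tr_s(" \t\n", ' ')

p translated.length # => 10
p collapsed         # => "a b c"
```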

Related: String#squeeze.

Source
static VALUErb_str_tr_s_bang(VALUE str, VALUE src, VALUE repl){    return tr_trans(str, src, repl, 1);}

Like String#tr_s, but modifies self in place. Returns self if any changes were made, nil otherwise.

Related: String#squeeze!.

Source
static VALUEstr_undump(VALUE str){    const char *s = RSTRING_PTR(str);    const char *s_end = RSTRING_END(str);    rb_encoding *enc = rb_enc_get(str);    VALUE undumped = rb_enc_str_new(s, 0L, enc);    bool utf8 = false;    bool binary = false;    int w;    rb_must_asciicompat(str);    if (rb_str_is_ascii_only_p(str) == Qfalse) {        rb_raise(rb_eRuntimeError, "non-ASCII character detected");    }    if (!str_null_check(str, &w)) {        rb_raise(rb_eRuntimeError, "string contains null byte");    }    if (RSTRING_LEN(str) < 2) goto invalid_format;    if (*s != '"') goto invalid_format;    /* strip '"' at the start */    s++;    for (;;) {        if (s >= s_end) {            rb_raise(rb_eRuntimeError, "unterminated dumped string");        }        if (*s == '"') {            /* epilogue */            s++;            if (s == s_end) {                /* ascii compatible dumped string */                break;            }            else {                static const char force_encoding_suffix[] = ".force_encoding(\""; /* "\")" */                static const char dup_suffix[] = ".dup";                const char *encname;                int encidx;                ptrdiff_t size;                /* check separately for strings dumped by older versions */                size = sizeof(dup_suffix) - 1;                if (s_end - s > size && memcmp(s, dup_suffix, size) == 0) s += size;                size = sizeof(force_encoding_suffix) - 1;                if (s_end - s <= size) goto invalid_format;                if (memcmp(s, force_encoding_suffix, size) != 0) goto invalid_format;                s += size;                if (utf8) {                    rb_raise(rb_eRuntimeError, "dumped string contained Unicode escape but used force_encoding");                }                encname = s;                s = memchr(s, '"', s_end-s);                size = s - encname;                if (!s) goto invalid_format;                if (s_end - s != 2) goto invalid_format;       
         if (s[0] != '"' || s[1] != ')') goto invalid_format;                encidx = rb_enc_find_index2(encname, (long)size);                if (encidx < 0) {                    rb_raise(rb_eRuntimeError, "dumped string has unknown encoding name");                }                rb_enc_associate_index(undumped, encidx);            }            break;        }        if (*s == '\\') {            s++;            if (s >= s_end) {                rb_raise(rb_eRuntimeError, "invalid escape");            }            undump_after_backslash(undumped, &s, s_end, &enc, &utf8, &binary);        }        else {            rb_str_cat(undumped, s++, 1);        }    }    RB_GC_GUARD(str);    return undumped;invalid_format:    rb_raise(rb_eRuntimeError, "invalid dumped string; not wrapped with '\"' nor '\"...\".force_encoding(\"...\")' form");}

Returns an unescaped version of self:

s_orig = "\f\x00\xff\\\"" # => "\f\u0000\xFF\\\""
s_dumped = s_orig.dump # => "\"\\f\\x00\\xFF\\\\\\\"\""
s_undumped = s_dumped.undump # => "\f\u0000\xFF\\\""
s_undumped == s_orig # => true
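
The round trip also works for non-ASCII text, which makes the dumped form usable where only ASCII is safe. A hedged sketch, assuming a UTF-8 source string (the sample text is arbitrary):

```ruby
original = "тест\t\u00E9"

dumped   = original.dump  # printable ASCII only, wrapped in double quotes
restored = dumped.undump  # escapes are converted back to characters

p dumped.ascii_only?   # => true
p restored == original # => true
```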

Related: String#dump (inverse of String#undump).

Source
static VALUErb_str_unicode_normalize(int argc, VALUE *argv, VALUE str){    return unicode_normalize_common(argc, argv, str, id_normalize);}

Returns a copy of self with Unicode normalization applied.

Argument form must be one of the following symbols (see Unicode normalization forms); the default is :nfc:

  • :nfc: Canonical decomposition, followed by canonical composition.

  • :nfd: Canonical decomposition.

  • :nfkc: Compatibility decomposition, followed by canonical composition.

  • :nfkd: Compatibility decomposition.

The encoding ofself must be one of:

  • Encoding::UTF_8

  • Encoding::UTF_16BE

  • Encoding::UTF_16LE

  • Encoding::UTF_32BE

  • Encoding::UTF_32LE

  • Encoding::GB18030

  • Encoding::UCS_2BE

  • Encoding::UCS_4BE

Examples:

"a\u0300".unicode_normalize # => "à" (same as "\u00E0")
"\u00E0".unicode_normalize(:nfd) # => "à" (same as "a\u0300")

Related: String#unicode_normalize!, String#unicode_normalized?.
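
Normalization matters when comparing strings that look identical but are encoded differently. A small sketch (the variable names are illustrative):

```ruby
composed   = "\u00E0"  # one code point: a with grave accent
decomposed = "a\u0300" # two code points: a, then combining grave accent

same_bytes = composed == decomposed
same_nfc   = composed.unicode_normalize == decomposed.unicode_normalize
to_nfd     = composed.unicode_normalize(:nfd)

# A compatibility form also replaces characters such as the "fi" ligature.
ligature = "\uFB01".unicode_normalize(:nfkc)

p same_bytes           # => false
p same_nfc             # => true
p to_nfd == decomposed # => true
p ligature             # => "fi"
```
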

Source
static VALUE
rb_str_unicode_normalize_bang(int argc, VALUE *argv, VALUE str)
{
    return rb_str_replace(str, unicode_normalize_common(argc, argv, str, id_normalize));
}

Like String#unicode_normalize, except that the normalization is performed on self.

Related: String#unicode_normalized?.
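
A short sketch of the in-place behavior, assuming the default form. The C source returns the result of rb_str_replace on the receiver, so the return value is the receiver itself.

```ruby
s = "a\u0300".dup               # decomposed; dup avoids mutating a literal
result = s.unicode_normalize!   # normalizes s in place

same_object = result.equal?(s)  # the receiver is returned
puts same_object
puts s.codepoints.inspect       # => [224]
```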

Source
static VALUE
rb_str_unicode_normalized_p(int argc, VALUE *argv, VALUE str)
{
    return unicode_normalize_common(argc, argv, str, id_normalized_p);
}

Returns true if self is in the given form of Unicode normalization, false otherwise. The form must be one of :nfc, :nfd, :nfkc, or :nfkd.

Examples:

"a\u0300".unicode_normalized?       # => false
"a\u0300".unicode_normalized?(:nfd) # => true
"\u00E0".unicode_normalized?        # => true
"\u00E0".unicode_normalized?(:nfd)  # => false

Raises an exception if self is not in a Unicode encoding:

s = "\xE0".force_encoding(Encoding::ISO_8859_1)
s.unicode_normalized? # Raises Encoding::CompatibilityError.

Related: String#unicode_normalize, String#unicode_normalize!.
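
When the encoding of the input is not known in advance, one possible approach is to rescue the exception described above. The helper name normalized_or_nil below is hypothetical and not part of Ruby; this is only a sketch.

```ruby
# Hypothetical helper: returns true or false for strings in a Unicode
# encoding, and nil when unicode_normalized? rejects the encoding.
def normalized_or_nil(str, form = :nfc)
  str.unicode_normalized?(form)
rescue Encoding::CompatibilityError
  nil
end

latin1 = "\xE0".dup.force_encoding(Encoding::ISO_8859_1)

puts normalized_or_nil("\u00E0").inspect         # => true
puts normalized_or_nil("a\u0300").inspect        # => false
puts normalized_or_nil("a\u0300", :nfd).inspect  # => true
puts normalized_or_nil(latin1).inspect           # => nil
```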

Source
# File pack.rb, line 23
def unpack(fmt, offset: 0)
  Primitive.attr! :use_block
  Primitive.pack_unpack(fmt, offset)
end

Extracts data from self.

If no block is given, forms objects that become the elements of a new array, and returns that array. Otherwise, yields each object.

See Packed Data.
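
The sketch below shows the array form, the block form, and the offset keyword from the signature above. The template 'Cna3' (an 8-bit unsigned integer, a 16-bit big-endian unsigned integer, and a 3-byte string) is chosen only for illustration; Packed Data lists the full set of directives.

```ruby
packed = [1, 258, 'abc'].pack('Cna3')

# Without a block: the extracted objects are returned in an array.
fields = packed.unpack('Cna3')            # => [1, 258, "abc"]

# With a block: each extracted object is yielded instead.
yielded = []
packed.unpack('Cna3') { |obj| yielded << obj }

# offset: skips that many leading bytes before the template is applied.
tail = packed.unpack('na3', offset: 1)    # => [258, "abc"]

puts fields.inspect, yielded.inspect, tail.inspect
```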

Source
# File pack.rb, line 33
def unpack1(fmt, offset: 0)
  Primitive.pack_unpack1(fmt, offset)
end

Like String#unpack, but unpacks and returns only the first extracted object. See Packed Data.
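
A small sketch of reading individual fields from a binary header with unpack1. The header layout (a 16-bit big-endian magic number followed by a one-byte version) is invented for the example.

```ruby
header = [0xCAFE, 7].pack('nC')

magic   = header.unpack1('n')             # => 51966 (0xCAFE)
version = header.unpack1('C', offset: 2)  # => 7

# Equivalent to taking the first element of the array from unpack.
same = header.unpack('n').first == magic

puts magic, version, same
```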

Source
static VALUE
rb_str_upcase(int argc, VALUE *argv, VALUE str)
{
    rb_encoding *enc;
    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE;
    VALUE ret;

    flags = check_case_options(argc, argv, flags);
    enc = str_true_enc(str);
    if (case_option_single_p(flags, enc, str)) {
        ret = rb_str_new(RSTRING_PTR(str), RSTRING_LEN(str));
        str_enc_copy_direct(ret, str);
        upcase_single(ret);
    }
    else if (flags&ONIGENC_CASE_ASCII_ONLY) {
        ret = rb_str_new(0, RSTRING_LEN(str));
        rb_str_ascii_casemap(str, ret, &flags, enc);
    }
    else {
        ret = rb_str_casemap(str, &flags, enc);
    }

    return ret;
}

Returns a string containing the upcased characters in self:

s = 'Hello World!' # => "Hello World!"
s.upcase           # => "HELLO WORLD!"

The casing may be affected by the given mapping; see Case Mapping.

Related: String#upcase!, String#downcase, String#downcase!.
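
To give a sense of how the mapping argument can change the result, here is a brief sketch using the :ascii and :turkic options. These option names come from the Case Mapping documentation rather than from this section, so treat that page as the authority.

```ruby
word = 'straße'

full  = word.upcase          # full Unicode mapping: "ß" becomes "SS"
ascii = word.upcase(:ascii)  # only a-z are affected; "ß" is left alone

# With :turkic, "i" maps to LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130).
turkic = 'i'.upcase(:turkic)

puts full, ascii, turkic
```

The three branches in the C source correspond roughly to these cases: a single-byte fast path, the ASCII-only mapping, and the general case mapping.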

Source
static VALUE
rb_str_upcase_bang(int argc, VALUE *argv, VALUE str)
{
    rb_encoding *enc;
    OnigCaseFoldType flags = ONIGENC_CASE_UPCASE;

    flags = check_case_options(argc, argv, flags);
    str_modify_keep_cr(str);
    enc = str_true_enc(str);
    if (case_option_single_p(flags, enc, str)) {
        if (upcase_single(str))
            flags |= ONIGENC_CASE_MODIFIED;
    }
    else if (flags&ONIGENC_CASE_ASCII_ONLY)
        rb_str_ascii_casemap(str, str, &flags, enc);
    else
        str_shared_replace(str, rb_str_casemap(str, &flags, enc));

    if (ONIGENC_CASE_MODIFIED&flags) return str;
    return Qnil;
}

Upcases the characters in self; returns self if any changes were made, nil otherwise:

s = 'Hello World!' # => "Hello World!"
s.upcase!          # => "HELLO WORLD!"
s                  # => "HELLO WORLD!"
s.upcase!          # => nil

The casing may be affected by the given mapping; see Case Mapping.

Related: String#upcase, String#downcase, String#downcase!.
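
Because the method returns nil when nothing changed, chaining further calls onto it can fail. A minimal sketch of the pitfall and one way around it:

```ruby
already_upper = 'ADA'.dup
lower         = 'ada'.dup

r1 = already_upper.upcase!   # => nil, nothing to change
r2 = lower.upcase!           # => "ADA", the receiver itself

# Chaining on the bang form is fragile, since nil has no #strip:
#   already_upper.upcase!.strip   # would raise NoMethodError
# The non-bang form always returns a string.
safe = already_upper.upcase.strip

puts r1.inspect, r2, safe
```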

Source
static VALUE
rb_str_upto(int argc, VALUE *argv, VALUE beg)
{
    VALUE end, exclusive;

    rb_scan_args(argc, argv, "11", &end, &exclusive);
    RETURN_ENUMERATOR(beg, argc, argv);
    return rb_str_upto_each(beg, end, RTEST(exclusive), str_upto_i, Qnil);
}

With a block given, calls the block with each String value returned by successive calls to String#succ; the first value is self, the next is self.succ, and so on; the sequence terminates when value other_string is reached; returns self:

'a8'.upto('b6') {|s| print s, ' ' } # => "a8"

Output:

a8 a9 b0 b1 b2 b3 b4 b5 b6

If argument exclusive is given as a truthy object, the last value is omitted:

'a8'.upto('b6', true) {|s| print s, ' ' } # => "a8"

Output:

a8 a9 b0 b1 b2 b3 b4 b5

If other_string would not be reached, does not call the block:

'25'.upto('5') {|s| fail s }
'aa'.upto('a') {|s| fail s }

With no block given, returns a new Enumerator:

'a8'.upto('b6') # => #<Enumerator: "a8":upto("b6")>
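
The returned Enumerator can be consumed with the usual Enumerable methods. The sketch below collects the same sequences as the block examples into arrays; the variable names are illustrative.

```ruby
inclusive   = 'a8'.upto('b6').to_a        # ends with "b6"
exclusive   = 'a8'.upto('b6', true).to_a  # last value omitted
unreachable = '25'.upto('5').to_a         # => []

puts inclusive.join(' ')
puts exclusive.join(' ')
puts unreachable.inspect
```
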
Source
static VALUE
rb_str_valid_encoding_p(VALUE str)
{
    int cr = rb_enc_str_coderange(str);

    return RBOOL(cr != ENC_CODERANGE_BROKEN);
}

Returns true if self is encoded correctly, false otherwise:

"\xc2\xa1".force_encoding(Encoding::UTF_8).valid_encoding? # => true
"\xc2".force_encoding(Encoding::UTF_8).valid_encoding?     # => false
"\x80".force_encoding(Encoding::UTF_8).valid_encoding?     # => false