mrbgems/mruby-encoding/README.md
This mrbgem provides lightweight, "poor man's" encoding functionality for mruby. It offers basic encoding support, focused primarily on UTF-8 and ASCII-8BIT.
Supported encodings:

- `Encoding::UTF_8`
- `Encoding::ASCII_8BIT` (aliased as `Encoding::BINARY`)

This gem introduces an `Encoding` module and extends the `String` and `Integer` classes with encoding-related methods.

## `Encoding` Module

A module (not a class, unlike standard Ruby) that holds encoding constants.

- `Encoding::UTF_8`: Represents the UTF-8 encoding.
- `Encoding::ASCII_8BIT`: Represents the ASCII-8BIT encoding.
- `Encoding::BINARY`: An alias for `Encoding::ASCII_8BIT`.

## `String` Methods

- `string.valid_encoding? -> true or false`
  - Returns `true` if the string is correctly encoded (particularly useful for UTF-8 strings). For ASCII-8BIT strings, it generally returns `true`.
- `string.encoding -> EncodingConstant`
  - Returns `Encoding::UTF_8` or `Encoding::BINARY`.
- `string.force_encoding(encoding_name) -> string`
  - Changes the string's encoding to `encoding_name` (e.g., `"UTF-8"`, `"ASCII-8BIT"`, `"BINARY"`).
  - Raises `ArgumentError` if an unsupported encoding name is provided.

## `Integer` Method

- `integer.chr(encoding_name = Encoding::BINARY) -> String`
  - If `encoding_name` is `"UTF-8"`, the integer is treated as a Unicode codepoint.
  - If `encoding_name` is `"ASCII-8BIT"` or `"BINARY"` (the default), the integer is treated as a byte value (0-255).
  - Raises `RangeError` if the integer is out of the valid range for the specified encoding.
  - Raises `ArgumentError` for unknown encoding names.

Example usage (`main.rb`):
```ruby
if __ENCODING__ == "UTF-8"
  s = "helloあ"
  puts s.encoding #=> Encoding::UTF_8
  puts s.valid_encoding? #=> true

  s2 = "\xff".force_encoding("UTF-8")
  puts s2.valid_encoding? #=> false

  s3 = "world"
  s3.force_encoding("BINARY")
  puts s3.encoding #=> Encoding::BINARY
  puts s3.valid_encoding? #=> true (ASCII-8BIT strings are generally considered valid)

  puts 65.chr #=> "A" (defaults to ASCII-8BIT)
  puts 230.chr("UTF-8") #=> "æ" (U+00E6)
  # Likewise, 12354.chr("UTF-8") is "あ" (U+3042)
else
  s = "hello"
  puts s.encoding #=> Encoding::BINARY (or ASCII-8BIT)
  # Attempting to force to UTF-8 in a non-UTF-8 mruby build might be limited
  # or behave as ASCII-8BIT depending on mruby's core string handling.
end
```
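The `force_encoding` and `valid_encoding?` methods described above can be combined to label incoming bytes as UTF-8 only when they validate. The following is a minimal sketch, assuming a UTF-8-enabled build; `tag_bytes` is a hypothetical helper name used for illustration and is not part of this gem.

```ruby
# Hypothetical helper (not provided by mruby-encoding): label a byte string as
# UTF-8 when its bytes form valid UTF-8, otherwise fall back to BINARY.
def tag_bytes(bytes)
  s = bytes.dup                # work on a copy so the caller's string is untouched
  s.force_encoding("UTF-8")    # tentatively label the copy as UTF-8
  s.force_encoding("BINARY") unless s.valid_encoding?
  s
end

text = tag_bytes("\xE3\x81\x82") # UTF-8 bytes for "あ"
blob = tag_bytes("\xff\xfe")     # not valid UTF-8

puts text.encoding == Encoding::UTF_8
puts blob.encoding == Encoding::BINARY
```

Under the behavior documented above, both comparisons should print `true`. Falling back to BINARY is one possible policy; a caller could equally raise an error or keep the UTF-8 label and check `valid_encoding?` later.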
```ruby
# Force encoding
my_string = "\xE3\x81\x82" # UTF-8 bytes for "あ"
puts my_string.encoding # May be BINARY by default if not created as a UTF-8 literal
my_string.force_encoding("UTF-8")
puts my_string.encoding #=> Encoding::UTF_8
puts my_string #=> あ

invalid_utf8 = "\xff\xfe"
invalid_utf8.force_encoding("UTF-8")
puts invalid_utf8.valid_encoding? #=> false

# Integer#chr
puts 65.chr #=> "A"
puts 65.chr("BINARY") #=> "A"

# When mruby is compiled with MRB_UTF8_STRING (a C build option, detected here via __ENCODING__)
if __ENCODING__ == "UTF-8"
  puts 12354.chr("UTF-8") #=> "あ"
  # puts 0x110000.chr("UTF-8") #=> RangeError
end
```
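Because `Integer#chr` is documented above as raising `RangeError` for out-of-range values and `ArgumentError` for unknown encoding names, callers handling untrusted input may want to wrap it. The following is a minimal sketch; `safe_chr` is a hypothetical helper name, not part of this gem, and returning `nil` on failure is an assumption about what a caller might prefer.

```ruby
# Hypothetical helper (not provided by mruby-encoding): convert an integer to a
# one-character string, returning nil instead of raising on bad input.
def safe_chr(value, encoding_name = "BINARY")
  value.chr(encoding_name)
rescue RangeError, ArgumentError
  # RangeError: value is outside the valid range for the encoding.
  # ArgumentError: the encoding name is not recognized.
  nil
end

p safe_chr(65)                # "A"
p safe_chr(256)               # nil, since 256 is not a byte value
p safe_chr(0x110000, "UTF-8") # nil, since the value is beyond the Unicode range
```

The expected results in the comments follow the error behavior described for `Integer#chr` above. Rescuing both exception classes in one place keeps call sites free of repeated `begin`/`rescue` blocks, at the cost of no longer distinguishing the two failure modes.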