In Ruby 1.8, setting the $KCODE global variable to 'u' makes regular expressions aware of UTF-8 characters, and loading jcode adds multibyte-aware String methods:
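# Treat source text and regular expressions as UTF-8 (Ruby 1.8)
$KCODE = 'u'
require 'jcode'

# Print each character (not each byte) of the string on its own line
"".scan(/./) do |character|
  puts character
end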
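
The snippet above relies on $KCODE and jcode, which belong to Ruby 1.8. As a rough sketch only, assuming Ruby 1.9 or later (where strings carry their own encoding and jcode is no longer part of the standard library), a similar per-character scan might look like the following. The sample string and the variable names are illustrative and not taken from the original.

```ruby
# Assumption: Ruby 1.9+; $KCODE and jcode are not used here.
# Hypothetical sample text; "\u00e9" is a single character stored as two bytes in UTF-8.
text = "h\u00e9llo"

# The /u flag marks the pattern as UTF-8, so "." matches one whole character.
characters = text.scan(/./u)

characters.each do |character|
  puts character
end

# Character count and byte count differ for multibyte text.
puts "#{characters.length} characters, #{text.bytesize} bytes"
```

Comparing the character count with the byte count is a quick way to confirm that the scan is splitting on characters rather than bytes.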