The html_entities() function converts characters that are illegal in HTML, such as &, <, and ", into their safe equivalents: &, <, and ", respectively.
PHP html_entities() Function has the following syntax.
string html_entities ( string html [, int options [, string charset]] )
Parameter | Description |
---|---|
html | the html string to convert |
options | A bitmask of flags |
charset | defines encoding used in conversion. Default value is ISO-8859-1 prior to PHP 5.4.0, and UTF-8 from PHP 5.4.0 onwards. You are highly encouraged to specify the correct value for your code. |
options is a bitmask of one or more of the following flags, which specify how to handle quotes, invalid code unit sequences and the used document type. The default is ENT_COMPAT | ENT_HTML401.
Available flags constants
Constant Name | Description |
---|---|
ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
ENT_QUOTES | Will convert both double and single quotes. |
ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
ENT_IGNORE | Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it ? may have security implications. |
ENT_SUBSTITUTE | Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or FFFD; (otherwise) instead of returning an empty string. |
ENT_DISALLOWED | Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or FFFD; (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content. |
ENT_HTML401 | Handle code as HTML 4.01. |
ENT_XML1 | Handle code as XML 1. |
ENT_XHTML | Handle code as XHTML. |
ENT_HTML5 | Handle code as HTML 5. |
The following character sets are supported:
Charset | Aliases | Description |
---|---|---|
ISO-8859-1 | ISO8859-1 | Western European, Latin-1. |
ISO-8859-5 | ISO8859-5 | Little used cyrillic charset (Latin/Cyrillic). |
ISO-8859-15 | ISO8859-15 | Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1). |
UTF-8 | NoAlias | ASCII compatible multi-byte 8-bit Unicode. |
cp866 | ibm866, 866 | DOS-specific Cyrillic charset. |
cp1251 | Windows-1251, win-1251, 1251 | Windows-specific Cyrillic charset. |
cp1252 | Windows-1252, 1252 | Windows specific charset for Western European. |
KOI8-R | koi8-ru, koi8r | Russian. |
BIG5 | 950 | Traditional Chinese, mainly used in Taiwan. |
GB2312 | 936 | Simplified Chinese, national standard character set. |
BIG5-HKSCS | NoAlias | Big5 with Hong Kong extensions, Traditional Chinese. |
Shift_JIS | SJIS, SJIS-win, cp932, 932 | Japanese |
EUC-JP | EUCJP, eucJP-win | Japanese |
MacRoman | Charset that was used by Mac OS. | |
'' | NoAlias | An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale (see nl_langinfo() and setlocale()), in this order. Not recommended. |
PHP html_entities() function returns the escaped string.
We can reverse this conversion using the html_entity_decode() function.
Convert string to HTML friendly
<?PHP
$title = "java2s.com & PHP";
$safe = htmlentities($title);
echo $safe;
?>
The code above generates the following result.