Some of the Unicode apostrophe-like characters are not handled as expected - Java java.lang


Description

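Some Unicode apostrophe-like characters (the modifier letters U+02B9, U+02BB, U+02BC, U+02BE and U+02BF) are not handled as expected by the ICUFoldingFilter: they are removed rather than folded into the standard apostrophe, which causes search discrepancies. The method in the demo code replaces each of them with a plain ASCII apostrophe (').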

Demo Code



import java.util.regex.Pattern;

public class Main {
    public static void main(String[] argv) throws Exception {
        // Sample input containing U+02BB, one of the characters the method replaces.
        String s = "Hawai\u02bbi";
        System.out.println(standardizeApostrophes(s));
    }

    // Compiled once at class load. Pattern instances are immutable and safe to share
    // between threads, so no lazy initialization is needed.
    private static final Pattern NON_STANDARD_APOSTROPHES_PATTERN = Pattern
            .compile("[\u02bb\u02be\u02bc\u02b9\u02bf]");

    /**
     * Some of the Unicode apostrophe-like characters are not handled as expected by the
     * ICUFoldingFilter (they are removed rather than folded into standard apostrophes).
     * This causes search discrepancies. See DISCOVERYACCESS-1084, DISCOVERYACCESS-1408.
     * Use this method only for values intended for searching, not for display
     * (including facets).
     *
     * @param s the value to normalize; may be null
     * @return s with U+02B9, U+02BB, U+02BC, U+02BE and U+02BF replaced by an
     *         ASCII apostrophe, or null if s is null
     */
    public static String standardizeApostrophes(String s) {
        if (s == null)
            return null;
        return NON_STANDARD_APOSTROPHES_PATTERN.matcher(s).replaceAll("'");
    }
}
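
Usage sketch

The following is a minimal sketch of how the method above behaves on a few inputs. The class name ApostropheFoldingSketch and the sample strings are hypothetical and chosen only for illustration; the character class is copied from the demo code. Note that other apostrophe-like characters, such as U+2019 (right single quotation mark), are not in the character class and pass through unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical class name; the pattern is the same one used in the demo code.
class ApostropheFoldingSketch {
    private static final Pattern NON_STANDARD =
            Pattern.compile("[\u02bb\u02be\u02bc\u02b9\u02bf]");

    static String standardizeApostrophes(String s) {
        return s == null ? null : NON_STANDARD.matcher(s).replaceAll("'");
    }

    public static void main(String[] args) {
        // U+02BB MODIFIER LETTER TURNED COMMA is replaced.
        System.out.println(standardizeApostrophes("Hawai\u02bbi"));   // Hawai'i

        // U+02BC MODIFIER LETTER APOSTROPHE is replaced.
        System.out.println(standardizeApostrophes("O\u02bcBrien"));   // O'Brien

        // U+2019 is not in the character class, so the value is returned unchanged.
        String curly = "don\u2019t";
        System.out.println(curly.equals(standardizeApostrophes(curly))); // true

        // A null input yields null rather than throwing.
        System.out.println(standardizeApostrophes(null));             // null
    }
}
```

Because the replacement changes the stored text, apply it only to the search copy of a value, as the Javadoc in the demo code advises, and keep the original for display.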
