Java String Normalize normalize(String s)

Description

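Normalizes a string and strips diacritics by decomposing it to NFKD, removing combining diacritical marks, and recomposing the result to NFKC.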

License

Open Source License

Declaration

public static String normalize(String s)

Method Source Code

//package com.java2s;

import java.text.Normalizer;

import java.util.regex.Pattern;

public class Main {
    // Matches runs of combining marks in the U+0300..U+036F block; compiled once and reused.
    private static final Pattern diacriticalMarksPattern = Pattern
            .compile("\\p{InCombiningDiacriticalMarks}+");

    public static String normalize(String s) {
        // Normalizes string and strips diacritics (map to ascii) by
        // 1. taking the NFKD (compatibility decomposition -
        //   in compatibility equivalence, formatting such as subscripting is lost -
        //   see http://unicode.org/reports/tr15/)
        // 2. Removing diacriticals
        // 3. Recombining into NFKC form (compatibility composition)
        // This process may be slow.
        // The main purpose of this function is to strip diacritics so that accented
        //  letters map to ASCII, but the compatibility forms may also normalize other
        //  characters (e.g. ligatures and superscripts).
        // A more conservative approach is to fold only ASCII characters explicitly
        //   (see RuleBasedNameMatcher.normalize).
        String d = Normalizer.normalize(s, Normalizer.Form.NFKD);
        d = diacriticalMarksPattern.matcher(d).replaceAll("");
        return Normalizer.normalize(d, Normalizer.Form.NFKC);
    }
}
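
Usage Example

The following is a minimal sketch of how the method above might be called, not part of the original source. The class name NormalizeDemo and the sample inputs are assumptions chosen for illustration; the normalize body repeats the same three steps (NFKD, strip combining marks, NFKC). It also shows two side effects noted in the comments: compatibility mappings rewrite ligatures and superscripts, while letters that have no Unicode decomposition (such as U+00F8) pass through unchanged.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical demo class; mirrors the normalize method shown above.
class NormalizeDemo {
    private static final Pattern MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    static String normalize(String s) {
        // 1. Compatibility decomposition: base letters and combining marks are separated.
        String d = Normalizer.normalize(s, Normalizer.Form.NFKD);
        // 2. Drop the combining diacritical marks.
        d = MARKS.matcher(d).replaceAll("");
        // 3. Compatibility composition of whatever remains.
        return Normalizer.normalize(d, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(normalize("caf\u00e9"));  // accented e: prints "cafe"
        System.out.println(normalize("\ufb01n"));    // "fi" ligature: prints "fin"
        System.out.println(normalize("x\u00b2"));    // superscript two: prints "x2"
        // U+00F8 (o with stroke) has no decomposition, so it is left as-is.
        System.out.println(normalize("\u00f8re"));
    }
}
```

Because the stroke in U+00F8 is not a combining mark, callers that need strictly ASCII output would still have to handle such letters separately, for example with the explicit folding approach mentioned in the source comments.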
