Java String Accent normalizeByRemovingAccent(final String string)

Here you can find the source of normalizeByRemovingAccent(final String string)

Description

Like #normalize(String), this method applies Unicode normalization, but it decomposes the string (NFD) and then removes the accents (combining diacritical marks), keeping the base letters.
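The two steps this method relies on (canonical decomposition, then deletion of the combining marks) can be observed on a single character. The following is a minimal sketch rather than part of the original source; the class name DecompositionSketch and the helper codePoints are hypothetical names introduced only for illustration.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the original source.
class DecompositionSketch {

    /** Returns the code points of s as space-separated U+XXXX labels. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'e' with acute accent as a single precomposed code point
        String composed = "\u00E9";
        // NFD splits it into the base letter and a combining acute accent
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);
        // Deleting the combining marks leaves only the base letter
        String stripped = decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");

        System.out.println(codePoints(composed));   // U+00E9
        System.out.println(codePoints(decomposed)); // U+0065 U+0301
        System.out.println(codePoints(stripped));   // U+0065
    }
}
```

Printing code points instead of the characters themselves keeps the output independent of the console encoding.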

License

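GNU Affero General Public License, version 3 or later, with the Silverpeas FLOSS exception (see the header of the source code)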

Parameter

Parameter   Description
string      the string to normalize; may be null, in which case null is returned. The result is not guaranteed if the string was not encoded in UTF-8.

Return

the normalized string with its accents removed, or null if the given string is null.

Declaration

public static String normalizeByRemovingAccent(final String string) 

Method Source Code

/*
 * Copyright (C) 2000 - 2018 Silverpeas
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Affero General Public License as
 * published by the Free Software Foundation, either version 3 of the
 * License, or (at your option) any later version.
 *
 * As a special exception to the terms and conditions of version 3.0 of
 * the GPL, you may redistribute this Program in connection with Free/Libre
 * Open Source Software ("FLOSS") applications as described in Silverpeas's
 * FLOSS exception.  You should have received a copy of the text describing
 * the FLOSS exception, and it is also available here:
 * "https://www.silverpeas.org/legal/floss_exception.html"
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU Affero General Public License for more details.
 *
 * You should have received a copy of the GNU Affero General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

import java.text.Normalizer;

public class Main {
    /**
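     * Normalizes the given string like {@link #normalize(String)}, but using the decomposed form
     * (NFD), and then removes the accents (combining diacritical marks) from its characters.
     * @param string the string to normalize, possibly null. The result is not guaranteed if the
     * string was not encoded in UTF-8.
     * @return the normalized string without accents, or null if the given string is null.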
     */
    public static String normalizeByRemovingAccent(final String string) {
        String normalized = string;
        if (normalized != null) {
            // separating all of the accent marks from the characters
            normalized = Normalizer.normalize(normalized, Normalizer.Form.NFD);
            // removing accent
            normalized = normalized.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        }
        return normalized;
    }

    /**
     * Normalizes the given string (which must be encoded in UTF-8) to the Unicode composed form
     * (NFC), so that the result contains only precomposed characters wherever they exist.
     * <p>Depending on the user's environment, data is sometimes sent with decomposed characters
     * (a base letter followed by combining marks), which can make the server misbehave, for
     * example by throwing an error on file download.</p>
     * @param string the string to normalize, possibly null. The result is not guaranteed if the
     * string was not encoded in UTF-8.
     * @return the normalized string, or null if the given string is null.
     */
    public static String normalize(final String string) {
        String normalized = string;
        if (normalized != null) {
            normalized = Normalizer.normalize(normalized, Normalizer.Form.NFC);
        }
        return normalized;
    }
}
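
A short usage sketch may help show what the accent removal does and does not cover. It is an illustration under assumptions, not part of the Silverpeas source: the class AccentUsageSketch and the method stripAccents are hypothetical names, and stripAccents is a local copy of the approach shown above so that the example compiles on its own. Non-ASCII characters are written as Unicode escapes to avoid depending on the source file encoding.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the original source.
class AccentUsageSketch {

    // Local copy of the NFD-then-strip approach, kept here so the sketch is self-contained.
    static String stripAccents(String s) {
        if (s == null) {
            return null;
        }
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // "Creme brulee" written with its grave, circumflex and acute accents
        System.out.println(stripAccents("Cr\u00E8me br\u00FBl\u00E9e")); // Creme brulee

        // U+00F8 (o with stroke) has no canonical decomposition, so it is left unchanged
        System.out.println("S\u00F8ren".equals(stripAccents("S\u00F8ren"))); // true

        // A null input is passed through rather than raising an exception
        System.out.println(stripAccents(null) == null); // true

        // Precomposed and decomposed spellings differ as strings but agree once stripped
        String composed = "caf\u00E9";
        String decomposed = "cafe\u0301";
        System.out.println(composed.equals(decomposed)); // false
        System.out.println(stripAccents(composed).equals(stripAccents(decomposed))); // true
    }
}
```

Letters whose mark is part of the base character, such as the stroke in U+00F8, are not affected because NFD does not split them into a base letter and a combining mark; handling them would need an explicit mapping.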

Related

  1. equalsIgnoreAccents(String lhs, String rhs, Locale locale)
  2. equalsIgnoreAccentsAndCase(String s1, String s2)
  3. equalsIgnoreCaseAndAccent(String string1, String string2, Locale locale)
  4. equalsStringIgnoringAccents(String str1, String str2)
  5. getDeAccentLoweredChars(String word)
  6. removeAccent(String s)
  7. removeAccent(String s)
  8. removeAccent(String strIn)
  9. removeAccents(final String s)