java.lang.Object
  net.sf.zekr.engine.search.SearchUtils
public class SearchUtils
This class contains several useful public static
methods for finding occurrences of one text in
another. Because Arabic text carries diacritics, there are also methods that either ignore or match
them during the search.
Field Summary |
---|
Fields inherited from interface net.sf.zekr.engine.search.ArabicCharacters |
---|
ALEF, ALEF_HAMZA_ABOVE, ALEF_HAMZA_BELOW, ALEF_MADDA, ALEF_MAKSURA, ALEF_WASLA, ARABIC_KAF, ARABIC_QUESION_MARK, ARABIC_YEH, BARREE_YEH, DAMMA, DAMMATAN, FARSI_KEHEH, FARSI_YEH, FATHA, FATHATAN, HAMZA, HAMZA_ABOVE, HAMZA_BELOW, KASRA, KASRATAN, MADDA, MADDAH_ABOVE, RUB_EL_HIZB, SAJDA_PLACE, SHADDA, SMALL_HIGH_MEEM, SMALL_LOW_SEEN, SMALL_ROUNDED_ZERO, SMALL_WAW, SMALL_YEH, SUKUN, SUPERSCRIPT_ALEF, SWASH_KEHEH, TATWEEL, TEH, TEH_MARBUTA, WAQF_HIGH_SEEN, WAQF_JEEM, WAQF_LA, WAQF_QALA, WAQF_SALA, WAQF_SMALL_MEEM, WAQF_THREE_DOT, WAW, WAW_HAMZA_ABOVE, YEH_HAMZA_ABOVE |
Constructor Summary | |
---|---|
SearchUtils()
|
Method Summary | |
---|---|
static java.lang.String |
arabicSimplify(java.lang.String str)
This method removes specific diacritics from the string. |
static java.lang.String |
arabicSimplify4AdvancedSearch(java.lang.String str)
This method removes specific diacritics from the string, and also replaces Hamza characters with their base character. |
static Range |
indexOfIgnoreDiacritic(java.lang.String src,
java.lang.String key,
boolean matchCase,
java.util.Locale locale)
Finds the Range of the first occurrence of key in src, ignoring diacritics. |
static Range |
indexOfMatchDiacritic(java.lang.String src,
java.lang.String key,
boolean matchCase,
java.util.Locale locale)
Finds the Range of the first occurrence of key in src, taking diacritics into account. |
static boolean |
isDiac(char ch)
Tests whether ch is one of the Arabic Harakat (diacritics): Sukun, Shadda, Fatha, Kasra, Damma, Fathatan, Kasratan, Dammatan, Superscript Alef. |
static java.lang.String |
replaceLayoutSimilarCharacters(java.lang.String str)
Replaces the Farsi Unicode Yeh with the Arabic Yeh, and likewise the Farsi Keheh with the Arabic Kaf. |
static java.lang.String |
replaceSimilarArabic(java.lang.String str)
Replaces similar Arabic characters that are commonly used in place of one another. |
static java.lang.String |
simplifyAdvancedSearchQuery(java.lang.String query)
|
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SearchUtils()
Method Detail |
---|
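public static java.lang.String replaceLayoutSimilarCharacters(java.lang.String str)
Replaces the Farsi Unicode Yeh with the Arabic Yeh, and likewise the Farsi Keheh with the Arabic Kaf.
Parameters:
str -
Returns:
String result
public static java.lang.String replaceSimilarArabic(java.lang.String str)
Replaces similar Arabic characters that are commonly used in place of one another.
Parameters:
str -
Returns:
String result
public static java.lang.String arabicSimplify(java.lang.String str)
This method removes specific diacritics from the string. ... replaceLayoutSimilarCharacters().
Parameters:
str - the string to be simplified
Returns:
str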
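As an illustration of the kind of normalization described for replaceLayoutSimilarCharacters and arabicSimplify, the following is a minimal standalone sketch, not the Zekr implementation. The class and method names are hypothetical. It assumes the diacritics removed are exactly the Harakat listed under isDiac in the Method Summary, and that layout-similar characters are replaced before diacritics are stripped; the source does not confirm either detail.

```java
final class SimplifySketch {
    // Assumed set of diacritics to drop: Sukun, Shadda, Fatha, Kasra, Damma,
    // Fathatan, Kasratan, Dammatan, Superscript Alef.
    private static final String HARAKAT =
            "\u0652\u0651\u064E\u0650\u064F\u064B\u064D\u064C\u0670";

    // Map the Farsi keyboard-layout variants to their Arabic counterparts.
    static String replaceLayoutSimilar(String str) {
        return str.replace('\u06CC', '\u064A')   // Farsi Yeh   -> Arabic Yeh
                  .replace('\u06A9', '\u0643');  // Farsi Keheh -> Arabic Kaf
    }

    // Copy every character that is not in the assumed Harakat set.
    static String stripHarakat(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            if (HARAKAT.indexOf(ch) < 0) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    // Hypothetical counterpart of arabicSimplify: normalize layout, then strip.
    static String simplify(String str) {
        return stripHarakat(replaceLayoutSimilar(str));
    }

    public static void main(String[] args) {
        // A vocalized word typed with the Farsi Keheh as its first letter.
        String vocalized = "\u06A9\u064E\u062A\u064E\u0628\u064E";
        System.out.println(simplify(vocalized));
    }
}
```

Stripping diacritics from both the query and the searched text in this way lets a plain substring comparison match vocalized and unvocalized spellings of the same word.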
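public static java.lang.String arabicSimplify4AdvancedSearch(java.lang.String str)
This method removes specific diacritics from the string, and also replaces Hamza characters with their base character.
Parameters:
str - string to be simplified
Returns:
str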
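The Method Summary describes arabicSimplify4AdvancedSearch as also replacing Hamza characters with their base character. The sketch below shows one plausible reading of that step only; it does not remove diacritics. The class name, method name, and the exact mapping (Hamza-carrying Alef, Waw, and Yeh folded to the bare letter) are assumptions, since the source does not list which characters are replaced.

```java
final class HamzaFoldSketch {
    // Fold letters that carry a Hamza onto the bare carrier letter.
    // The mapping is an assumption made for illustration.
    static String foldHamza(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            switch (ch) {
                case '\u0623': // Alef with Hamza above
                case '\u0625': // Alef with Hamza below
                    sb.append('\u0627'); // Alef
                    break;
                case '\u0624': // Waw with Hamza above
                    sb.append('\u0648'); // Waw
                    break;
                case '\u0626': // Yeh with Hamza above
                    sb.append('\u064A'); // Yeh
                    break;
                default:
                    sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The initial Alef with Hamza above is folded to a bare Alef.
        System.out.println(foldHamza("\u0623\u062D\u0645\u062F"));
    }
}
```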
public static java.lang.String simplifyAdvancedSearchQuery(java.lang.String query)
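public static boolean isDiac(char ch)
Tests whether ch is one of the Arabic Harakat (diacritics): Sukun, Shadda, Fatha, Kasra, Damma, Fathatan, Kasratan, Dammatan, Superscript Alef.
Parameters:
ch - the character to be examined
Returns:
true if ch is an Arabic Haraka (diacritic), otherwise false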
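A check of this kind can be sketched as a lookup against the Unicode code points of the nine Harakat named in the Method Summary. This is an illustrative standalone version with a hypothetical class and method name; the real isDiac presumably uses the constants inherited from ArabicCharacters, whose values are not shown on this page.

```java
final class HarakatSketch {
    // Code points of the nine Harakat, in the order the summary lists them.
    private static final String HARAKAT =
            "\u0652"    // Sukun
          + "\u0651"    // Shadda
          + "\u064E"    // Fatha
          + "\u0650"    // Kasra
          + "\u064F"    // Damma
          + "\u064B"    // Fathatan
          + "\u064D"    // Kasratan
          + "\u064C"    // Dammatan
          + "\u0670";   // Superscript Alef

    // True when ch is one of the characters above.
    static boolean isHaraka(char ch) {
        return HARAKAT.indexOf(ch) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isHaraka('\u064E')); // Fatha, a diacritic
        System.out.println(isHaraka('\u0628')); // Beh, a base letter
    }
}
```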
public static Range indexOfIgnoreDiacritic(java.lang.String src, java.lang.String key, boolean matchCase, java.util.Locale locale)
Finds the Range of the first occurrence of key in src. This method ignores diacritics in both the src and key strings.
Parameters:
src - source string to be searched
key - non-null target string whose first occurrence in src is to be found
matchCase - specifies whether to search in a case-sensitive manner
locale - the text locale (for case conversion)
Returns:
Range object from the space character just before the key (or the start of the source string if no space is found) to the first space just after the key in src (or the end of src if no space is found)
public static Range indexOfMatchDiacritic(java.lang.String src, java.lang.String key, boolean matchCase, java.util.Locale locale)
Finds the Range of the first occurrence of key in src. This method takes diacritics into account in both src and key.
Parameters:
src - source string to be searched
key - target string whose first occurrence in src is to be found
matchCase - specifies whether to search in a case-sensitive manner
locale - the text locale (for case conversion)
Returns:
Range object from the space character just before the key (or the start of the source string if no space is found) to the first space just after the key in src (or the end of src if no space is found)
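The search-then-expand behavior described for indexOfIgnoreDiacritic can be sketched as follows. This is a simplified illustration, not the library's code: Span is a hypothetical stand-in for Range, case handling (matchCase, locale) is omitted, the diacritic set is assumed to be the nine Harakat listed under isDiac, and the returned offsets are assumed to exclude the surrounding spaces.

```java
final class IgnoreDiacriticSearchSketch {
    private static final String HARAKAT =
            "\u0652\u0651\u064E\u0650\u064F\u064B\u064D\u064C\u0670";

    // Hypothetical stand-in for Range: offsets [from, to) into src.
    static final class Span {
        final int from;
        final int to;
        Span(int from, int to) { this.from = from; this.to = to; }
    }

    static boolean isHaraka(char ch) {
        return HARAKAT.indexOf(ch) >= 0;
    }

    static String strip(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            if (!isHaraka(s.charAt(i))) sb.append(s.charAt(i));
        }
        return sb.toString();
    }

    // Returns the word-bounded span of the first match, or null if none.
    static Span find(String src, String key) {
        String k = strip(key);
        if (k.isEmpty()) return null;
        for (int start = 0; start < src.length(); start++) {
            if (isHaraka(src.charAt(start))) continue; // matches begin on a base letter
            int i = start;
            int j = 0;
            while (i < src.length() && j < k.length()) {
                char ch = src.charAt(i);
                if (isHaraka(ch)) { i++; continue; }   // skip diacritics in src
                if (ch != k.charAt(j)) break;
                i++;
                j++;
            }
            if (j == k.length()) {
                // Expand to the enclosing spaces, or to the ends of src.
                int from = src.lastIndexOf(' ', start) + 1;
                int to = src.indexOf(' ', i);
                return new Span(from, to < 0 ? src.length() : to);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String src = "\u0642\u064E\u0627\u0644\u064E \u0643\u064E\u062A\u064E\u0628\u064E \u0647\u064F\u0646\u064E\u0627";
        Span s = find(src, "\u062A\u0628");
        System.out.println(src.substring(s.from, s.to));
    }
}
```

Matching directly against src while skipping diacritics keeps the offsets valid for the original string, so the expanded span can be used to highlight the whole vocalized word.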