Does MOSS support Arabic Search?
This is the question I always receive from customers and partners. My answer always starts with the following:
The question should not be whether MOSS supports Arabic Search, because the answer is simply: YES, IT DOES.
The question should be: how powerful is MOSS Search when it comes to Arabic language?
I would like to point out a couple of whitepapers that are must-reads on this subject. Afterwards, I'm going to go over the configuration steps required to reach the desired results.
Two whitepapers produced by our product teams that elaborate on multi-language support in Microsoft Office SharePoint Server 2007:
- Building Multilingual Sites white paper: Talks about the overall features for building multilingual sites like language packs, site variations, and search components.
- Arabic Word Breaker white paper: Shows that we provide morphological analysis for Arabic, with a good set of features.
Mike Taghizadeh’s blog covers MOSS 2007 Search Capabilities. Two excellent blogs are worth reading about word stemming:
Here are some facts to summarize the above:
- Word breaker is the component of MOSS Search that does stemming.
- Word stemming (morphology) is composed of two things: morphological analysis, and morphological generation.
- In turn, morphological analysis/generation is further composed of two things: inflectional, and derivational.
- Word breakers for different languages come shipped with MOSS 2007. They are NOT part of language packs.
- Stemming is off by default for Arabic (and some other languages). You need to enable stemming (at query time) from the Search Center. [see below]
- Stemming for a specific language is triggered by the language used in the client browser. [see below]
Enabling word stemming in MOSS 2007 – Search Center:
- As the owner or administrator of the site, go to the Search Center results page. (You can issue a query to get there, or navigate to the results.aspx page directly.)
- Edit the page: Site Actions -> Edit Page
- Go to the Core Results Web Part, and choose to modify this web part
- In the Web Part settings panel, check the option that reads "Enable word stemming …" under the Results Query options.
- Click OK, and save the page or publish it.
Now you have your search ready for word stemming at query time.
Testing Arabic word stemming by setting Arabic as the default language in IE:
MOSS 2007 has been designed to choose word breakers according to the language of the client browser. The browser sends its language preferences to MOSS in the Accept-Language request header (HTTP_ACCEPT_LANGUAGE on the server), with the default language listed first. MOSS will in turn invoke the word breaker for that specific language.
So to trigger the Arabic word breaker, you need to change the IE language settings (Tools -> Internet Options -> Languages). If Arabic is set as the default, MOSS will invoke the Arabic word breaker and will do the stemming at query time. See the multilingual whitepaper above for more details.
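
To make the browser-language idea more concrete, here is a small Python sketch of how a server could pick the preferred language out of an Accept-Language value. This is only an illustration of the general HTTP convention (comma-separated language tags with optional q weights); it is not MOSS code, and the function name preferred_language is made up for this example. How MOSS itself reads the header internally is not covered here.

```python
from typing import Optional


def preferred_language(accept_language: str) -> Optional[str]:
    """Return the primary language subtag ranked highest in an
    Accept-Language value, or None if the value is empty.

    Hypothetical helper for illustration; not part of MOSS.
    """
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        item = item.strip()
        if not item:
            continue
        # Each item looks like "ar-SA" or "en-US;q=0.5".
        tag, _, params = item.partition(";")
        quality = 1.0  # a missing q weight means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as lowest priority
        # Sort by highest weight first, then by original order.
        ranked.append((-quality, position, tag.strip().lower()))
    if not ranked:
        return None
    ranked.sort()
    # Keep only the primary subtag: "ar-sa" becomes "ar".
    return ranked[0][2].split("-")[0]


if __name__ == "__main__":
    # Arabic moved to the top of the IE language list.
    print(preferred_language("ar-SA,en-US;q=0.5"))
    # English still first, Arabic only as a fallback.
    print(preferred_language("en-US,ar;q=0.8"))
```

With Arabic at the top of the browser's language list, the first call yields "ar", which corresponds to the case where the Arabic word breaker would be chosen. With English first, the result is "en", so you would not expect Arabic stemming for that query.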