KSearch & KIndexer

Overview

 

Arabic is like no other language!

Arabic has a unique beauty, poetry and logic. Yet, it is not possible to make easy and meaningful searches in Arabic through standard internet search engines – until now!

 

Why can't you make easy searches in Arabic?

Traditional search engines built for English, for example, rely on wildcards to match word variants. This may be adequate for English, but it is deficient for Arabic text. The reason is the complexity of Arabic morphology. While an English word typically has five or six inflections (changes in the form of the word that give it extra meaning), an Arabic word can have up to 10,000. Morphology is a basic component of how search engines operate, so searching Arabic text properly requires processing it according to the morphology of Arabic itself.
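
As a rough illustration of this limitation, the following Python sketch applies a wildcard pattern to a few forms of the word "اجتماع" (meeting). It uses the standard-library fnmatch module as a stand-in for a wildcard matcher; the word list and the function name wildcard_hits are assumptions made for this example and say nothing about how any particular engine is implemented.

```python
from fnmatch import fnmatchcase


def wildcard_hits(pattern, words):
    """Return the words that match a shell-style wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]


# Illustrative word list (an assumption for this sketch).
words = [
    "اجتماعهم",  # [their] meeting
    "واجتماع",   # [and] a meeting
    "اجتمع",     # [he] met
    "يجتمعون",   # [they] meet
]

# "*" on both sides: any characters before or after the query word.
hits = wildcard_hits("*اجتماع*", words)
missed = [w for w in words if w not in hits]

print("matched:", hits)
print("missed:", missed)
```

The pattern only finds forms that contain the query word letter for letter, so the two verb forms are missed even though a reader would expect them in the results.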

 

Are there any search engines that suit Arabic?

Yes! We give you KSearch, the Arabic search engine by alKhawarizmy!

KSearch can index and search Arabic websites as well as documents on an intranet. It uses a fast Arabic morphological analyzer that lets the user search for all inflections of an Arabic word using morphological rules ("Morphological Search"). To improve the accuracy of each search, KSearch also processes text by the meaning of each word, and it supports wildcard search, stem search and Boolean search.

Included in KSearch is KIndexer, a fast and efficient indexer which is capable of indexing Arabic words in documents and databases using morphological rules, as well as indexing the word form "as is", for words that do not have an Arabic origin.
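
One way to picture these two indexing modes is an inverted index whose key is a shared dictionary form when the analyzer knows the word, and the unchanged word otherwise. The sketch below is a minimal Python illustration under that assumption; the TOY_FORMS table is a hand-written stand-in for a morphological analyzer, and the function names and sample documents are hypothetical, not part of KIndexer.

```python
from collections import defaultdict

# Hypothetical stand-in for a morphological analyzer:
# maps an inflected form to the key it is indexed under.
TOY_FORMS = {
    "اجتماع": "اجتماع",
    "اجتمع": "اجتماع",
    "يجتمعون": "اجتماع",
}


def index_key(word):
    """Use the analyzed form when known, otherwise the word 'as is'."""
    return TOY_FORMS.get(word, word)


def build_index(docs):
    """Build an inverted index: key -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[index_key(word)].add(doc_id)
    return index


def search(index, query_word):
    """Look the query up under the same key used at indexing time."""
    return sorted(index.get(index_key(query_word), set()))


# Made-up sample documents.
docs = {
    "a": "اجتمع الوزراء",   # the ministers met
    "b": "يجتمعون غدا",     # they meet tomorrow
    "c": "كتاب جورج",       # George's book
}
index = build_index(docs)

print(search(index, "اجتماع"))  # documents containing any known form
print(search(index, "جورج"))    # foreign name, indexed as is
```

Because the query goes through the same index_key function as the documents, a search for one form reaches documents that contain other forms, while words unknown to the table are still findable under their exact spelling.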

 

Features

KSearch can find all the inflected forms of an Arabic word. It also allows the user to search for words related to a particular meaning, by selecting that meaning, and then displays the relevant search results. The following outlines the main features of KSearch/KIndexer:

  • Arabic Morphological Search:
    If a user were to search for the word "اجتماع" (meeting), the search results would contain various inflected forms, such as "اجتمع" ([he] met), "يجتمعون" ([they] meet), etc. Traditional engines that do not have morphological search might, at most, retrieve words that have "اجتماع" (a meeting) as part of the word, such as "اجتماعهم" ([their] meeting), "واجتماع" ([and] a meeting).
  • Differentiation between Word Meanings:
    If an input query word has more than one meaning, the user may choose the meaning to search for. The search results will then largely contain the inflected forms of the word that belong to that meaning. This helps reduce the irrelevant results that morphological search alone would return.
  • Search using Logical Operators:
    In addition to the "All Words" and "Any Words" search types, the system includes "Logical Search" or "Boolean Search". Logical Search allows users to search for exact phrases or to combine terms with the logical operators AND, OR and NOT. It also allows users to specify word adjacency (proximity) by giving the number of intervening words and stating whether the words must appear in the order of user input.
  • Search using Wildcards:
    Wildcard search lets the user find words that have not been linguistically processed, such as proper nouns and other words of non-Arabic origin. These words can be retrieved 'as is', along with words that look similar, by specifying wildcard characters; the wildcards supported are ? (denoting a single Arabic character) and * (denoting any number of Arabic characters). Thus, if the words جورج, لجورج and بجورج were found in the text, they would not be processed linguistically, but the user would still be able to retrieve them all by specifying *جورج, for example, with the wildcard typed before the word so that it covers the prefixed letter.
  • Search words are highlighted in the results pages:
    This is very important for Arabic, in particular, as it relieves the user of having to additionally search for all the inflected forms of a search word on a page; the lack of this feature would render the search system useless.
  • The following document formats are supported:
    MS Office, HTML, TXT, RTF and PDF, as well as Unicode-encoded documents.
  • Comprehensive Dictionary of Contemporary Arabic (Modern Standard Arabic):
    The Arabic dictionary that supports KSearch includes up-to-date words used in the various media. It is based upon the published dictionary "A Dictionary of the Contemporary Arabic Language", by Prof. Ahmed Mokhtar Omar, the late renowned Arabic lexicographer.
  • KIndexer Fast Indexing Engine that uses 64-bit Technology:
    KSearch includes a fast indexing engine for the supported file formats and databases. The indexing rate reaches 50,000 words/sec on a desktop PC equipped with an Intel Core2 Duo 2.33GHz processor, 1GB of memory and a SATA hard disk drive. In addition, the indexing engine uses 64-bit technology, so the size of the generated index is not capped the way it is with 32-bit technology, which limits the index to 4GB (2^32 bytes).
  • Comprehensive Index Management:
    The indexing system includes comprehensive index management that allows a user to divide groups of documents or web pages into separate indexes for flexible management. The system can also delete, update and merge indexes, and add files or folders to an index or delete them from it.
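
The proximity option of Logical Search listed above can be pictured with a small Python sketch. It assumes the text is already split into words and compares them exactly, whereas a real engine would presumably match through morphological analysis; the function near, its parameters and the sample sentence are hypothetical and chosen only for this example.

```python
def near(tokens, first, second, max_gap, ordered=True):
    """Return True if `first` and `second` occur with at most
    `max_gap` words between them.

    With ordered=True, `first` must come before `second`;
    with ordered=False, either order is accepted.
    """
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    for i in pos_first:
        for j in pos_second:
            if i == j:
                continue  # same position (only possible if first == second)
            gap = abs(j - i) - 1  # number of intervening words
            if gap <= max_gap and (j > i or not ordered):
                return True
    return False


# Made-up sample sentence: "meeting of the council of ministers today".
tokens = "اجتماع مجلس الوزراء اليوم".split()

print(near(tokens, "اجتماع", "الوزراء", max_gap=1))
print(near(tokens, "الوزراء", "اجتماع", max_gap=1))
print(near(tokens, "الوزراء", "اجتماع", max_gap=1, ordered=False))
```

Keeping the gap limit and the ordering flag as separate parameters mirrors the two choices the feature description gives the user: how many intervening words to allow, and whether the order of input matters.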

Editions:


  • Database Edition:
    In this edition, the website's database is indexed (using the KIndexer engine) and the indexes are stored on the site's server. Users can then search via a browser, which uses the KSearch search engine installed on the server. The indexer comes either with or without a browser-based user interface.
  • Document Edition:
    In this edition, a company's documents are indexed on a centralized server connected to the company's intranet. Anyone on the company’s network may search these documents via a browser, through an internal web application that accesses the KSearch search engine. The indexer comes either with or without a browser-based user interface.
  • Desktop Standard and Professional Edition:
    This edition works on a single personal computer; the user's documents are indexed on the user's PC through a non-browser user interface. The search user interface is also non-browser based.
 
 
All Rights Reserved. www.AlKhawarizmy.com