RiSearch v.1.0 Manual
© S. Tarasov
Results sorting
Found documents can be sorted by relevance, by document modification date,
or by document size, or returned in the order they appear in the database.
Each sorting type can be turned "On" or "Off" in the configuration file.
By default, the script stores the information needed for all types
of sorting. Please note that many web servers do not return a "Last modified"
date for dynamic pages; in that case the time of indexing is used instead.
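The sorting options and the "Last modified" fallback described above can be illustrated with a minimal Python sketch. This is not RiSearch's own code: the `Doc` record, its field names, the `sort_results` function, and the descending sort direction (newest or largest first) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    # Hypothetical record layout, not the real RiSearch database format.
    url: str
    size: int                      # document size in bytes
    indexed_at: int                # Unix time when the page was indexed
    last_modified: Optional[int]   # Unix time sent by the server, or None
    rating: float = 0.0            # relevance score for the current query


def effective_date(doc: Doc) -> int:
    """Use the indexing time when the server sent no "Last modified" date."""
    return doc.last_modified if doc.last_modified is not None else doc.indexed_at


def sort_results(docs: List[Doc], order: str = "database") -> List[Doc]:
    """Return a new list in the requested order (descending is an assumption)."""
    if order == "rating":
        return sorted(docs, key=lambda d: d.rating, reverse=True)
    if order == "date":
        return sorted(docs, key=effective_date, reverse=True)
    if order == "size":
        return sorted(docs, key=lambda d: d.size, reverse=True)
    return list(docs)  # database order: keep documents as stored


docs = [
    Doc("/a.html", size=1200, indexed_at=1000, last_modified=900, rating=0.4),
    Doc("/b.cgi", size=300, indexed_at=2000, last_modified=None, rating=0.9),
    Doc("/c.html", size=5000, indexed_at=1500, last_modified=1400, rating=0.1),
]
by_date = [d.url for d in sort_results(docs, "date")]
```

Here the dynamic page `/b.cgi` has no "Last modified" date, so it is ordered by its indexing time.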
Relevance
Relevance is an abstract measure of how well a document satisfies
the user's query. RiSearch uses the number of occurrences of the query
words as the basis for the relevance calculation. Each occurrence of
a query word in a document gives one point to that document. Words in
certain areas of a document (such as the title, headings, links, and
text in bold or italic) can be given a higher rating.
This can be set up in the configuration file. The calculated point count is
then normalized according to the most frequent term in the document
(so that the longest document does not necessarily get the highest rating).
The document rating for a given term can then be adjusted according
to the number of documents that contain this term. As a result, the most
common words have little effect on the sorting of documents.
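The three steps above (weighted points, normalization, adjustment by document count) can be sketched in Python as follows. This is an illustration under stated assumptions, not the formula RiSearch actually uses: the manual does not give the exact arithmetic, so dividing by the occurrence count of the most frequent term and scaling by `log(1 + N / df)` are guesses, and the names `AREA_WEIGHTS`, `raw_points`, and `score` are hypothetical. Only the title weight of 5 mirrors the `weight_title => 5` setting; the heading weight is invented.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical area weights; "title": 5 mirrors weight_title => 5,
# the heading value is invented for illustration.
AREA_WEIGHTS = {"body": 1, "title": 5, "heading": 3}

DocAreas = Dict[str, List[str]]  # area name -> list of words found there


def raw_points(doc: DocAreas, term: str) -> int:
    """One point per occurrence, multiplied by the weight of its area."""
    return sum(AREA_WEIGHTS.get(area, 1) * words.count(term)
               for area, words in doc.items())


def max_term_count(doc: DocAreas) -> int:
    """Occurrence count of the most frequent term in the document."""
    counts = Counter(w for words in doc.values() for w in words)
    return max(counts.values()) if counts else 1


def score(doc: DocAreas, query: List[str], doc_freq: Dict[str, int],
          total_docs: int, word_freq: bool = True) -> float:
    """Assumed scoring: normalized points, optionally damped for common words."""
    norm = max_term_count(doc)
    total = 0.0
    for term in query:
        points = raw_points(doc, term) / norm
        if word_freq and doc_freq.get(term):
            # Assumed adjustment: terms found in many documents count less.
            points *= math.log(1 + total_docs / doc_freq[term])
        total += points
    return total


short_doc = {"title": ["perl", "search"], "body": ["perl", "search", "script"]}
long_doc = {"title": ["news"], "body": ["perl"] + ["news"] * 9}
short_score = score(short_doc, ["perl"], {}, 10, word_freq=False)
long_score = score(long_doc, ["perl"], {}, 10, word_freq=False)
```

In this sketch the short document outranks the long one for "perl": it earns 6 weighted points divided by a top term count of 2, while the long document earns 1 point divided by 10. Because the adjustment by document count is applied at query time here, it can be toggled without touching the stored counts.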
Configuration parameters
allow_sort_by_rating => 1,
- turns ON the page rating calculation. Additionally, you may sort the documents
by last-modified date and by size
(allow_sort_by_date, allow_sort_by_size).
weight_title => 5,
- you may control the weight of a word depending on its position on the page.
Each occurrence of a word increases the word rating by 1. You may choose another
weight for words that occur in the page TITLE, headings, meta tags, and links to other pages...
word_freq => 1,
- turns ON the document score normalization according to word frequency
(common words will get a lower rating). Can be turned OFF and ON without reindexing the site.