RiSearch Pro v.3.2 Manual
© С. Тарасов
Advanced search
Sometimes you need more than a standard search script.
Suppose you have a website with an archive of articles and you want to search by author,
subject or publication date, or to sort results by author or publication date.
This is usually done with a relational database (such as MySQL), but you may not
have the time or money to reorganize your site.
RiSearch can help to some extent. In advanced search each document
is represented as the document text plus several additional attributes
(short text strings or numbers, such as "author", "title", "subject", "date of publication"),
which can be indexed together with the document body or used in comparison operations.
Configuration
Advanced search is disabled by default, so you have to configure the script
before using it. First, define each attribute
with two strings; the text between these strings is used as the attribute's value.
For example, you can use additional meta tags:
<META NAME="Author" CONTENT="Ivanov Ivan">
<META NAME="Subject" CONTENT="biology">
<META NAME="Date" CONTENT="22/03/2020">
For this example, write the following in the configuration file
(attribute numbers start from zero):
$attr_def[0] = [ '<META NAME="Author" CONTENT="', '">' ];
$attr_def[1] = [ '<META NAME="Subject" CONTENT="', '">' ];
$attr_def[2] = [ '<META NAME="Date" CONTENT="', '">' ];
$attr_def[3] = [ '<META NAME="Price" CONTENT="', '">' ];
Two built-in attributes are also available:
$attr_def[4] = [ 'SIZE' ]; - size of the document.
$attr_def[5] = [ 'DATE' ]; - the document's last-modified date.
If your attributes cannot be defined this way
(for example, they are stored in a separate file),
you have to rewrite the lib::common_lib::get_attribute() procedure.
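The two-string definition can be pictured with a minimal sketch. It is not the code of get_attribute(); the helper name extract_between is hypothetical, and the literal, case-sensitive matching is an assumption of this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RiSearch): return the text found
# between $start and $end in $html, or undef if either marker is missing.
sub extract_between {
    my ($html, $start, $end) = @_;
    my $from = index($html, $start);
    return undef if $from < 0;
    $from += length $start;
    my $to = index($html, $end, $from);
    return undef if $to < 0;
    return substr($html, $from, $to - $from);
}

my $page = '<META NAME="Author" CONTENT="Ivanov Ivan">'
         . '<META NAME="Subject" CONTENT="biology">';

my $author  = extract_between($page, '<META NAME="Author" CONTENT="',  '">');
my $subject = extract_between($page, '<META NAME="Subject" CONTENT="', '">');
print "$author / $subject\n";    # prints "Ivanov Ivan / biology"
```

A document that lacks the start string simply yields no value for that attribute in this sketch.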
Next, specify how these attributes will be indexed.
Several options are available:
- 1 - the attribute is indexed as normal text.
- 2 - the attribute's value is stored with the document's description
and can later be printed in search results.
- 4 - the attribute is an integer. Comparison operations can be applied
to such attributes (so you can search for documents where an attribute's value
is larger than 1000).
- 8 - the attribute is a floating-point number. As with integer attributes, comparison operations
can be used. Please note that an index containing float attributes
is not portable between platforms.
- 16 - the attribute is a date string. The script tries to recognize
the date and store it in a different format so that comparison operations
are possible. The script recognizes dates in the following formats:
23.03.98, 23/03/98, 23.03.1998, Mar 23 1998.
Other formats can be added if necessary. The day/month order (day first or month first)
is defined by the date_format parameter in the configuration file.
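To show why a date must be stored in a different format before it can be compared, here is a rough sketch that converts the listed formats into a sortable number. It is not RiSearch's own code: the helper name date_to_number, the $day_first flag (standing in for date_format), the YYYYMMDD result and the two-digit-year pivot of 70 are all assumptions.

```perl
use strict;
use warnings;

my %MONTH = (jan => 1, feb => 2, mar => 3, apr => 4,  may => 5,  jun => 6,
             jul => 7, aug => 8, sep => 9, oct => 10, nov => 11, dec => 12);

# Hypothetical helper: turn a date string into a YYYYMMDD integer so that
# dates compare like numbers. Returns undef for unrecognized input.
sub date_to_number {
    my ($text, $day_first) = @_;
    my ($d, $m, $y);
    if ($text =~ m{^\s*(\d{1,2})[./](\d{1,2})[./](\d{4}|\d{2})\s*$}) {
        # Numeric date: the field order depends on the day/month setting.
        ($d, $m, $y) = $day_first ? ($1, $2, $3) : ($2, $1, $3);
    }
    elsif ($text =~ m{^\s*([A-Za-z]{3})\s+(\d{1,2})\s+(\d{4})\s*$}) {
        # "Mar 23 1998" style: month name, day, four-digit year.
        ($m, $d, $y) = ($MONTH{lc $1}, $2, $3);
        return undef unless defined $m;
    }
    else {
        return undef;
    }
    # Assumed pivot: two-digit years below 70 belong to the 2000s.
    $y += ($y < 70 ? 2000 : 1900) if $y < 100;
    return undef if $m < 1 || $m > 12 || $d < 1 || $d > 31;
    return $y * 10000 + $m * 100 + $d;
}

print date_to_number('23.03.98', 1), "\n";      # prints 19980323
print date_to_number('Mar 23 1998', 1), "\n";   # prints 19980323
```

Once every date is a single number, "greater than" and "less than" reduce to ordinary numeric comparison.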
For each attribute, choose the required options and put their
sum in the configuration file in this format:
$attr_conf[0] = 1+2;
$attr_conf[1] = 1+2;
$attr_conf[2] = 1+2+16;
$attr_conf[3] = 8;
$attr_conf[4] = 4;
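Because the option values are powers of two, each sum identifies exactly one combination of options. The sketch below decodes a sum with bitwise AND; the helper name describe_options and the option labels are ours, and how the script itself tests the value is an assumption.

```perl
use strict;
use warnings;

# Option values from the list above; the labels are only for illustration.
my %OPTION = (
    1  => 'indexed as text',
    2  => 'stored in description',
    4  => 'integer',
    8  => 'float',
    16 => 'date',
);

# Hypothetical helper: return the labels of the options contained in a
# sum such as 1+2+16, in ascending order of option value.
sub describe_options {
    my ($sum) = @_;
    return map  { $OPTION{$_} }
           grep { $sum & $_ }
           sort { $a <=> $b } keys %OPTION;
}

print join(', ', describe_options(1 + 2 + 16)), "\n";
# prints "indexed as text, stored in description, date"
```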
For text attributes you can define a weight, which is used
when sorting documents by relevance.
$attr_weight[0] = 2;
$attr_weight[1] = 1;
$attr_weight[2] = 0;
Query language
New parameters were added to the query string to support the new features.
- All old parameters work as before, so you don't have to change
anything in your old search forms.
- To search in specific attributes, add the attribute numbers to the query string
like this: "&a=1&a=3" (similar to searching in zones). For example, to
find all papers by Smith, use the following query:
search.pl?q=smith&a=0
- To find one word in one attribute and another word in another attribute,
use queries like these:
search.pl?q0=smith&q1=biology - all papers by Smith on the subject of biology
search.pl?q0=smith&q2=1998 - all papers by Smith from 1998
(if the parameter "q=word" is added, this word is searched for in all indexed
attributes).
- Comparison operations for numeric attributes:
- a4lt=10 - value of attribute 4 is less than 10;
- a4le=10 - value of attribute 4 is less than or equal to 10;
- a4gt=10 - value of attribute 4 is greater than 10;
- a4ge=10 - value of attribute 4 is greater than or equal to 10;
- a4eq=10 - value of attribute 4 is equal to 10;
- a4ne=10 - value of attribute 4 is not equal to 10;
For example, all papers by Smith with a price below 10 USD:
search.pl?q0=smith&a3lt=10
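If you build such query strings from your own Perl code, a small helper keeps the parameters in order and lets a name such as "a" appear more than once. This is a sketch only: build_query is a hypothetical helper, and the percent-encoding assumes plain ASCII (byte) values.

```perl
use strict;
use warnings;

# Hypothetical helper: build a query string from a list of name/value
# pairs, keeping their order so that a name such as "a" may repeat.
sub build_query {
    my ($script, @pairs) = @_;
    my @parts;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        # Percent-encode everything except unreserved URL characters.
        $value =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
        push @parts, "$name=$value";
    }
    return $script . '?' . join('&', @parts);
}

print build_query('search.pl', 'q' => 'smith', 'a' => 0, 'a' => 1), "\n";
# prints "search.pl?q=smith&a=0&a=1"
print build_query('search.pl', 'q0' => 'smith', 'a3lt' => 10), "\n";
# prints "search.pl?q0=smith&a3lt=10"
```

The parameter names are quoted on purpose: q, s and y are also Perl quote-like operators, so quoting them avoids any ambiguity.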
If several comparison operations are used in one query, they are combined
using "AND" logic ("a1lt=10&a2gt=10" means attribute 1 must be less than 10 and
attribute 2 greater than 10). Add the parameter "ctype=OR" to change the logic to "OR"
("a1lt=10&a2gt=10&ctype=OR").
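The AND/OR rule can be sketched as follows. This is an illustration of the logic described above rather than the script's implementation; the helper name matches and the data layout (an array of attribute values per document) are assumptions.

```perl
use strict;
use warnings;

# Comparison suffixes from the list above mapped to Perl comparisons.
my %COMPARE = (
    lt => sub { $_[0] <  $_[1] },
    le => sub { $_[0] <= $_[1] },
    gt => sub { $_[0] >  $_[1] },
    ge => sub { $_[0] >= $_[1] },
    eq => sub { $_[0] == $_[1] },
    ne => sub { $_[0] != $_[1] },
);

# Hypothetical helper: test one document's attribute values (array ref
# indexed by attribute number) against parameters such as a3lt => 10.
sub matches {
    my ($attrs, $params, $ctype) = @_;
    my @results;
    for my $name (sort keys %$params) {
        next unless $name =~ /^a(\d+)(lt|le|gt|ge|eq|ne)$/;
        push @results, $COMPARE{$2}->($attrs->[$1], $params->{$name}) ? 1 : 0;
    }
    return 1 unless @results;            # no comparisons: nothing to filter
    my $hits = grep { $_ } @results;     # number of comparisons that held
    my $any  = defined $ctype && $ctype eq 'OR';
    return $any ? ($hits > 0 ? 1 : 0) : ($hits == @results ? 1 : 0);
}

# Attribute 3 is the price, attribute 4 the document size.
my $doc = [ 'Smith', 'biology', 19980323, 8.5, 1200 ];
print matches($doc, { a3lt => 10, a4gt => 5000 }), "\n";         # prints 0
print matches($doc, { a3lt => 10, a4gt => 5000 }, 'OR'), "\n";   # prints 1
```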
For date attributes, comparison operations take the following form:
search.pl?q0=smith&a2gt=01.01.2000
You can also specify the day, month and year separately:
search.pl?q0=ivanov&y2gt=2000&m2gt=01&d2gt=01
More examples:
"y2eq=2000" - all articles for year 2000;
"y2eq=2000&m2eq=2" - all articles for February 2000.
Results sorting
Results can be sorted by an attribute
using the "s" parameter (for example
"search.pl?q0=smith&a2gt=01.01.2000&s=2").
Please note that sorting by a text attribute can take a long time
if many matches are found.
You can sort results by several attributes at once
(this works only with text attributes). For example:
"q=word&s=2_3" - results are sorted by attribute 2
and then by attribute 3 where attribute 2 is equal.
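Sorting by several attributes amounts to a comparison that falls back to the next attribute on a tie. The sketch below shows the idea on in-memory data; sort_by_attributes is a hypothetical helper, and the ascending order and the record layout are assumptions of this example.

```perl
use strict;
use warnings;

# Hypothetical helper: order documents by the attribute numbers listed in
# an "s" value such as "2_3", comparing as text and falling back to the
# next attribute whenever the current one is equal.
sub sort_by_attributes {
    my ($s, @docs) = @_;
    my @keys = split /_/, $s;
    return sort {
        my $order = 0;
        for my $n (@keys) {
            $order = $a->{attr}[$n] cmp $b->{attr}[$n];
            last if $order;
        }
        $order;
    } @docs;
}

# Attribute 0 is the author, attribute 1 the subject.
my @docs = (
    { url => 'p1.html', attr => [ 'Smith',  'biology' ] },
    { url => 'p2.html', attr => [ 'Ivanov', 'physics' ] },
    { url => 'p3.html', attr => [ 'Ivanov', 'biology' ] },
);

# Sort by subject first, then by author.
my @sorted = sort_by_attributes('1_0', @docs);
print join(' ', map { $_->{url} } @sorted), "\n";
# prints "p3.html p1.html p2.html"
```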
Results output
Attributes stored in the document's description (option "2" in the attribute configuration)
can be printed on the results page, just like the URL or title. To do this,
use the string %attr_N% in the template (where N is the attribute number).
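The placeholder substitution can be pictured with a short sketch. It is not the script's template engine: fill_template is a hypothetical helper, the template line is made up for the example, and no HTML escaping is applied.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each %attr_N% placeholder in a template
# line with the stored value of attribute N (empty string if none).
sub fill_template {
    my ($template, $attrs) = @_;
    $template =~ s/%attr_(\d+)%/defined $attrs->[$1] ? $attrs->[$1] : ''/ge;
    return $template;
}

my $line = fill_template('<b>%attr_0%</b>, %attr_1% (%attr_2%)',
                         [ 'Ivanov Ivan', 'biology', '22/03/2020' ]);
print "$line\n";    # prints "<b>Ivanov Ivan</b>, biology (22/03/2020)"
```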