The syntax you use to query Indexing Service depends entirely on how you submit the query.
Basically, all queries submitted to Indexing Service with the ixsso interface take
the form of a string (queries with the ADO providers for VB.NET, C#.NET or C++.NET
use SQL syntax, but it's still a string). A query can be as simple as:
Dog and Cat
(finds all documents with the words Dog and Cat in them)
All the way to:
(@write > 2000/01/01) and (@filename animals) and (@contents {weight value=1.0}
Dog {near dist = 200, unit = word} {weight value=0.5} Cat not Hot)
(finds all documents on the system that contain the words ‘Cat’ and ‘Dog’ but not
the word ‘Hot’, and that have the word ‘animals’ in the filename, and that have
been modified since the 1st Jan 2000. Also treat the word ‘Dog’ as twice as important
as ‘Cat’)
The last sample is far too complicated for most end users to work with, so it is
the developer's job to come up with a way of getting the functionality of the second
example by having a user type in the words in the first example and possibly choose
a couple of items from a menu.
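As a rough illustration of that idea, the sketch below turns a plain keyword string plus a couple of menu choices into a bracketed query. It is written in Python purely for illustration; the function name build_query and its parameters are hypothetical, and it only builds the string, it does not submit anything to Indexing Service.

```python
def build_query(keywords, filename=None, modified_since=None):
    """Build an Indexing Service query string from simple user input.

    keywords       -- free text typed by the user, e.g. "Dog Cat"
    filename       -- optional text the filename must contain (a menu choice)
    modified_since -- optional date string in YYYY/MM/DD form (a menu choice)
    """
    # The contents block carries whatever the user typed.
    blocks = ["(@contents " + keywords.strip() + ")"]
    if filename:
        blocks.append("(@filename " + filename.strip() + ")")
    if modified_since:
        blocks.append("(@write > " + modified_since + ")")
    # Each section sits in its own brackets; the sections are joined with 'and'.
    return " and ".join(blocks)


print(build_query("Dog Cat"))
print(build_query("Dog Cat", filename="animals", modified_since="2000/01/01"))
```

The user only ever sees the keyword box and the optional choices; the property syntax stays hidden inside the function.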
To try some queries of your own, first ensure Indexing Service is running. Right-click
on your My Computer icon, then click on ‘Manage’. Expand ‘Services and Applications’,
then expand ‘Indexing Service’. Expand one of your catalogs, then click on
‘Query the Catalog’. The Indexing Service Query Form will now load, and you can use
it to run your queries.
If you are using Windows 2000 or later then your installation of Indexing Service should
operate in free text mode by default (i.e. Dialect 2). That means that if you entered
the words “Dog Cat” as a query, Indexing Service would just give you back all documents
that contained the words Dog and/or Cat, ranking them by how many times the words
appeared in each document.
(Dialect 1 is now obsolete; it is the syntax that was used by Indexing Service before
Windows 2000. Under Dialect 1 a search for “Cat Dog” would have only given you documents
containing that exact phrase, i.e. it operated in phrase mode by default, not free text
mode like Dialect 2.)
To submit a query programmatically to Indexing Service, a developer has to build up a string
within their application, submit the query, and then process the results. As already
seen, the string can be just a couple of keywords or a great deal more.
If you just want to type in keywords then you can just type away and hit 'Search'.
However, if you want to build a query string by hand that will involve querying a
document's properties as well as its contents, then I recommend you put each section
into a block encased by brackets and join the blocks with AND, OR or NOT.
E.g. (@contents dog cat) and (@write -1m)
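The bracket-and-join recommendation can be sketched in a few lines of Python. The helper name join_blocks is hypothetical and the code only assembles the string; it assumes each section is passed in without its surrounding brackets.

```python
def join_blocks(sections, operator="and"):
    """Wrap each query section in brackets and join them with one operator."""
    allowed = ("and", "or", "not")
    op = operator.lower()
    if op not in allowed:
        raise ValueError("operator must be one of: " + ", ".join(allowed))
    # Every section gets its own pair of brackets before joining.
    wrapped = ["(" + section.strip() + ")" for section in sections]
    return (" " + op + " ").join(wrapped)


print(join_blocks(["@contents dog cat", "@write -1m"]))
```

Keeping each section in its own block means a property test such as @write cannot accidentally be read as part of the contents keywords.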
A good front-end application like CISearch will do this for you, so all you have to do
is type in your keywords / query text and choose some options from a few drop-down boxes.
If the application is particularly clever, like CISearch, then it will even optimise the
query for you, ensuring you always receive the results you want.
Some examples of operators you can use include (‘---’ indicates variable content):
(@contents ------ ) = contents of the document must comply with this section
Keywords only, e.g. (@contents dog cat)
Keywords joined by operators (and, or, not, near), e.g. (@contents dog and cat)
An exact phrase encased in quotes, e.g. (@contents "dog cat")
Keywords joined by the ‘near’ operator with extended criteria, e.g. (@contents dog {near dist = 200, unit = word} cat)
(@filename ------ ) = filename must contain these characters
Similar to @contents
(@write ------ ) = modified date must match these criteria
Dates must be in the form YY/MM/DD or YYYY/MM/DD.
Documents modified in the last 7 days = (@write -7d)
Documents modified in the last week = (@write -1w)
Documents modified in the last month = (@write -1m)
Documents modified in the last year = (@write -1y)
Documents modified since 2001/01/01 = (@write > 2001/01/01)
Documents modified before 2001/01/01 = (@write < 2001/01/01)
Documents modified between 2001/01/01 and 2002/01/01 = (@write > 2001/01/01 & < 2002/01/01)
(@size ------ ) = file size must match these criteria
Size greater than 1000000 bytes = (@size > 1000000)
Size less than 2000000 bytes = (@size < 2000000)
Size between 1000000 bytes and 2000000 bytes = (@size > 1000000 and < 2000000)
(@docauthor ------ ) = document author must match these criteria
Document author must match 'Fred Smith' = (@docauthor Fred Smith)
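If these property blocks are generated by a program rather than typed by hand, small helpers keep the formatting consistent. The Python sketch below is only an illustration: the function names are hypothetical, and the output simply mirrors the @write and @size examples listed above.

```python
from datetime import date


def write_since(day):
    """Block matching documents modified after the given date."""
    return "(@write > " + day.strftime("%Y/%m/%d") + ")"


def write_within(amount, unit):
    """Block matching documents modified in the last `amount` units.

    unit is one of 'd', 'w', 'm' or 'y', as in the examples above.
    """
    if unit not in ("d", "w", "m", "y"):
        raise ValueError("unit must be d, w, m or y")
    if amount < 1:
        raise ValueError("amount must be at least 1")
    return "(@write -" + str(amount) + unit + ")"


def size_between(low, high):
    """Block matching files larger than `low` and smaller than `high` bytes."""
    if low >= high:
        raise ValueError("low must be smaller than high")
    return "(@size > " + str(low) + " and < " + str(high) + ")"


print(write_since(date(2001, 1, 1)))
print(write_within(7, "d"))
print(size_between(1000000, 2000000))
```

Building the date from a real date object, rather than from free text, guarantees the YYYY/MM/DD form the query expects.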
AND, NOT, OR, NEAR :-
While you won't find these words in an Indexing Service catalog, as they are noise
words, they are much more than that. They can radically alter a query and the
entire way it executes. Take the following, for example.
Dog and Cat
(finds all documents with the words Dog and Cat in them)
Dog or Cat
(finds all documents with the words Dog and/or Cat in them)
Dog not Cat
(finds all documents with the word 'Dog' in them, as long as they do not also have 'Cat'
in them)
Now that is fairly simple, but it is about to get interesting.
Dog near Cat
That would return all documents where both Dog and Cat were present AND were within
150 words of each other. Ranking then takes into account not only how many times the
words appear, but also how close they are to each other.
Dog {near dist = 200, unit = word} Cat
This would override the default for 'near': instead of 150 words being the maximum
distance between the keywords, 200 would be.
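A program that builds proximity searches could produce both forms from one helper. The Python sketch below uses a hypothetical function name, near_clause, and only covers the 'word' unit shown in the example above.

```python
def near_clause(first, second, dist=None):
    """Join two keywords with NEAR, optionally overriding the distance.

    With no dist the plain 'near' operator is used, so the default
    distance applies; otherwise the extended form is written out.
    """
    if dist is None:
        return first + " near " + second
    if dist < 1:
        raise ValueError("dist must be at least 1")
    return first + " {near dist = " + str(dist) + ", unit = word} " + second


print(near_clause("Dog", "Cat"))
print(near_clause("Dog", "Cat", dist=200))
# NEAR belongs inside a contents block, so wrap the clause accordingly.
print("(@contents " + near_clause("Dog", "Cat", dist=200) + ")")
```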
AND, OR and NOT can all be used both to join keywords, exact phrases, date criteria
etc. together inside blocks, and to join different blocks together.
NEAR should only ever be used to join together keywords and exact phrases inside a
contents block.