CISearch.Net, the browser-based file search engine.
CISearch.Net
uses Microsoft's Indexing Service to build catalogs of the files on your computer
system, and then makes those catalogs fully searchable from users' browsers through
ASP.Net-based search forms.
Why not have a look at the amazing benefits of CISearch.Net
yourself?

Microsoft Indexing Service IFilters
IFilters are essentially plug-ins for Indexing Service that tell it how to index
(i.e. read the content and properties of) a type of file, as determined by the
file's extension. Indexing Service calls the appropriate IFilter for a file when it
comes across that file during a sweep of a file system.
IFilters are integrated into Indexing Service through the IFilter interface. They
are normally written in C++, because they require fairly low-level programming and
C++ is well suited to that kind of work.
In practice an IFilter is just a DLL file containing the code needed to handle a
specific file type. Strictly speaking, the IFilter does not so much tell Indexing
Service how to handle the file as do the extraction itself: it understands the
structure and layout of a given file type and knows how to extract textual content
from it. To index a new type of file, you create and implement a new IFilter.
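As a rough illustration of what "understanding the structure of a file type and extracting its text" means, here is a minimal sketch in Python. It is only an analogy: real IFilters are DLLs that implement the IFilter interface, and the class and function names below (`HtmlTextExtractor`, `extract_text`) are hypothetical and not part of Indexing Service.

```python
from html.parser import HTMLParser


class HtmlTextExtractor(HTMLParser):
    """Toy stand-in for a filter: knows HTML's layout, keeps only its text."""

    SKIPPED_TAGS = {"script", "style"}  # content that is not readable text

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside <script>/<style> blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the readable text of an HTML string, markup removed."""
    parser = HtmlTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><title>Report</title><style>p {color: red}</style></head>"
    "<body><p>Quarterly <b>sales</b> rose.</p></body></html>"
)
print(extract_text(sample))  # prints: Report Quarterly sales rose.
```

The point of the sketch is the division of labour: the indexer only ever sees the extracted words, while all knowledge of the file format lives in the extractor.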
Do you really need a new IFilter for that strange file type?
In most cases Indexing Service can still index a file even if it has no IFilter
loaded for that file type. If the file contains text content, there is a very good
chance Indexing Service can still index it. You enable this by telling Indexing
Service to “Index files with unknown extensions”:
1. Right-click your “My Computer” icon and click “Manage”.
2. Expand “Services and Applications”.
3. Right-click “Indexing Service”, then click “Properties”.
4. Tick the “Index files with unknown extensions” box.
5. Click “OK”, then restart Indexing Service.
Now Indexing Service will try to index pretty much every file it comes across.
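The effect of that setting can be pictured with a small Python sketch. This is a toy model of the behaviour described above, not Indexing Service's actual logic: the `FILTERS` table, `plain_text_filter` and `choose_filter` are hypothetical names, and treating unknown files as UTF-8 text is an assumption made for the example.

```python
import os


def plain_text_filter(raw):
    """Fallback: treat the bytes as text, replacing anything undecodable."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical table of installed filters, keyed by file extension.
FILTERS = {
    ".txt": plain_text_filter,
}


def choose_filter(filename, index_unknown_extensions):
    """Return the filter to use for a file, or None if its content is skipped."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in FILTERS:
        return FILTERS[extension]
    if index_unknown_extensions:
        # No filter is registered, so fall back to reading the file as text.
        return plain_text_filter
    return None


for name in ("notes.txt", "build.log"):
    for setting in (False, True):
        chosen = choose_filter(name, setting)
        print(name, setting, chosen.__name__ if chosen else "content skipped")
```

With the setting off, only `notes.txt` gets a filter in this model; with it on, `build.log` is read as text as well, which is why files that really do contain text stand a good chance of being indexed.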
Finding out which IFilter DLL is used to filter a file type
The following example, for
HTML files, shows how to find the filter DLL for a document.
1. Determine the persistent handler registered for an extension. Check whether
the extension for the type of files that the DLL filters has a persistent handler
registered under the registry entry HKEY_LOCAL_MACHINE\SOFTWARE\Classes. Let
this be <Value1>.
   HKEY_LOCAL_MACHINE\SOFTWARE\Classes
      .htm
         PersistentHandler = {EEC97550-47A9-11CF-B952-00AA0051FE20}
If this entry exists, skip to step 4 and use <Value1> there.
2. Alternatively, determine the CLSID. If no persistent handler is registered
for the extension, find the CLSID associated with the document type under the registry
entry HKEY_LOCAL_MACHINE\SOFTWARE\Classes. Let this be <Value2>.
   HKEY_LOCAL_MACHINE\SOFTWARE\Classes
      htmlfile = Class for WWW HTML files
         CLSID = {25336920-03F9-11CF-8FD0-00AA00686F13}
3. Determine the persistent handler. Using <Value2> determined in Step 2, find
the PersistentHandler value for the HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<Value2>
entry. Let this be <Value3>.
   HKEY_LOCAL_MACHINE\SOFTWARE\Classes
      CLSID
         {25336920-03F9-11CF-8FD0-00AA00686F13} = WWW HTML files
            PersistentHandler = {EEC97550-47A9-11CF-B952-00AA0051FE20}
4. Determine the IFilter persistent handler GUID. Using <Value1> determined in
Step 1 or <Value3> determined in Step 3, find the IFilter persistent handler
GUID for the document type. The value under the registry entry
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<Value1 or 3>\PersistentAddinsRegistered\{89BCB740-6119-101A-BCB7-00DD010655AF}
yields the IFilter persistent handler GUID for this document type. Let this be
<Value4>. 89BCB740-6119-101A-BCB7-00DD010655AF is the IFilter interface GUID.
   HKEY_LOCAL_MACHINE\SOFTWARE\Classes
      CLSID
         {EEC97550-47A9-11CF-B952-00AA0051FE20} = REG_SZ HTML File Persistent Handler
            PersistentAddinsRegistered
               {89BCB740-6119-101A-BCB7-00DD010655AF} = REG_SZ {E0CA5340-4534-11CF-B952-00AA0051FE20}
5. Determine the filter DLL. Using <Value4> determined in Step 4, the filter
DLL can be found under the entry
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<Value4>\InprocServer32.
   HKEY_LOCAL_MACHINE\SOFTWARE\Classes
      CLSID
         {E0CA5340-4534-11CF-B952-00AA0051FE20} = REG_SZ HTML Filter
            InprocServer32 = REG_SZ nlhtml.dll
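The five steps above can be sketched as a single lookup function. The Python below is a minimal sketch under stated assumptions: `find_filter_dll`, `key` and `SAMPLE_REGISTRY` are hypothetical names, the registry is modelled as a plain dictionary holding the example values from this page, and the document type in Step 2 is assumed to be the default value of the extension's key (for example `htmlfile` for `.htm`).

```python
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"  # IFilter interface GUID
CLASSES = "SOFTWARE\\Classes"  # relative to HKEY_LOCAL_MACHINE


def key(*parts):
    """Join registry key names into one backslash-separated path."""
    return "\\".join(parts)


def find_filter_dll(extension, read_default):
    """Follow steps 1 to 5 and return the filter DLL name, or None.

    read_default(path) must return the default value of the key at
    `path`, or None when the key does not exist.
    """
    # Step 1: persistent handler registered directly for the extension.
    handler = read_default(key(CLASSES, extension, "PersistentHandler"))
    if not handler:
        # Step 2: extension -> document type -> CLSID (assumed mapping).
        doc_type = read_default(key(CLASSES, extension))
        clsid = read_default(key(CLASSES, doc_type, "CLSID")) if doc_type else None
        # Step 3: persistent handler registered for that CLSID.
        if clsid:
            handler = read_default(key(CLASSES, "CLSID", clsid, "PersistentHandler"))
    if not handler:
        return None
    # Step 4: IFilter persistent handler GUID for the document type.
    filter_clsid = read_default(
        key(CLASSES, "CLSID", handler, "PersistentAddinsRegistered", IFILTER_IID)
    )
    if not filter_clsid:
        return None
    # Step 5: the DLL that implements the filter.
    return read_default(key(CLASSES, "CLSID", filter_clsid, "InprocServer32"))


# In-memory copy of the example entries shown in the steps above.
HTML_HANDLER = "{EEC97550-47A9-11CF-B952-00AA0051FE20}"
HTML_FILTER = "{E0CA5340-4534-11CF-B952-00AA0051FE20}"
SAMPLE_REGISTRY = {
    key(CLASSES, ".htm", "PersistentHandler"): HTML_HANDLER,
    key(CLASSES, "CLSID", HTML_HANDLER, "PersistentAddinsRegistered", IFILTER_IID): HTML_FILTER,
    key(CLASSES, "CLSID", HTML_FILTER, "InprocServer32"): "nlhtml.dll",
}

print(find_filter_dll(".htm", SAMPLE_REGISTRY.get))  # prints: nlhtml.dll
```

Passing the reader in as a function keeps the sketch runnable on any machine. On a real Windows system, `read_default` could plausibly be backed by the standard library's `winreg.QueryValue` on `winreg.HKEY_LOCAL_MACHINE`, returning None when the call raises `OSError`; that wiring is left out here.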
For further information on IFilters, see the IFilter section on MSDN.