The metadata service
-----------------------------------------------------------

The metadata service is a powerful combination of an indexer and a search
engine for all kinds of documents.  It combines several technologies, such
as PostgreSQL and Zope, to create a contextually rich database which can
be queried in any number of forms.

The metadata service is part of the Search services project.


WHAT IS IT?

The metadata service is, at its core, a multipurpose indexer, designed to
index information from different sources (at this point, only the file system
is indexed).  Applications can query the metadata service using a standards-
based protocol and query language.  There is one known application at the
moment, named Quick search, which uses the metadata service to quickly locate
files.

The metadata service boasts the following features:
* Extensibility: new infospace agents can be plugged in, so indexing of
  logs, PIM information, bookmarks, Web cache, and others is possible.
  The filesystem agent is also extensible, with a plugin interface to add
  support for all kinds of files.
* Resource control: indexing speed adjusts automatically according to system
  load.  This keeps the computer usable even while indexing large amounts of
  files.
* Real-time indexing: the filesystem agent will, if available, use the
  inotify file notification system to reindex changed files.  These files
  have priority over regular disk scans, so modified files will show up
  quickly in search results.
* High performance: preliminary tests show that the filesystem agent can
  index in excess of 10 files per second, which makes searching through an
  entire computer's information a reality.
* High security: the metadata service has been developed entirely in Python,
  a powerful and exploit-resistant language, with automatic garbage
  collection and exceptions, which help in resource management.
  The standard code base does not evaluate any user-supplied code, and
  trusts only controlled C-based implementations, such as the python-inotify
  module, the GNOME VFS module, and, of course, the C implementation of
  Python itself.  Together, these facts suggest that the service is suitable
  for running as a long-running system-level daemon.
  WARNING: at the moment, no filtering is applied to search results, so
  inaccessible files' paths may be revealed to any user with a search
  tool.  A fix is planned and is expected to be quick to implement, but
  other checklist items currently have priority.  DO NOT RUN IN
  PRODUCTION SETTINGS OR WITH SENSITIVE DATA.
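
The real-time indexing feature above says that changed files take priority
over regular disk scans.  The following is a minimal sketch of one way such
prioritization could work, using a heap-based queue.  The class name, the
priority constants, and the file paths are hypothetical and are not taken
from the metadata service's code base.

```python
import heapq
import itertools

# Hypothetical priority levels; lower numbers are served first.
PRIORITY_CHANGED = 0   # file reported as modified (e.g. by inotify)
PRIORITY_SCAN = 1      # file found by a regular disk scan


class IndexQueue:
    """Queue that hands out changed files before scanned files."""

    def __init__(self):
        self._heap = []
        # Monotonic counter: keeps first-in, first-out order within
        # the same priority level.
        self._counter = itertools.count()

    def push(self, path, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), path))

    def pop(self):
        _priority, _order, path = heapq.heappop(self._heap)
        return path

    def __len__(self):
        return len(self._heap)


queue = IndexQueue()
queue.push("/home/user/old-report.txt", PRIORITY_SCAN)
queue.push("/home/user/photo.jpg", PRIORITY_SCAN)
queue.push("/home/user/notes.txt", PRIORITY_CHANGED)

# The changed file comes out first, then the scanned files in FIFO order.
order = [queue.pop() for _ in range(len(queue))]
```

In this sketch the counter acts as a tie-breaker, so files with equal
priority are indexed in the order they were queued.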

PREREQUISITES

This software requires:
- Python
- the Python GTK+ 2 bindings, and GNOME VFS bindings (for better MIME detection)
- a PostgreSQL server installation (a preexisting database cluster or
  running server is not required)
- python-mytools, available from the same place as this software
- Zope's standalone ZODB3 3.3.1 or later
- /sbin/blkid, from the e2fsprogs package
- python-UnixSocketTransport, for the msctl script

Optionally, enabling inotify in your OS and installing the python-inotify
module will automatically enable real-time indexing.


INSTALLATION

Please read the INSTALL file.


FEEDBACK: REPORTING BUGS, ETC.

Don't forget to visit this project's Web site first:

   http://www.amautacorp.com/staff/Rudd-O/projects/search-services/

to resolve any doubts or questions you might have.


LICENSE AND LEGAL NOTICE

This software is under the GPL.  See the file COPYING for licensing
information.  Contact us if you need us to license this software under
a different license.


HOW TO USE IT

Documentation is a bit sparse right now.  Read the INSTALL file for
installation instructions.

To run:
   metadata-service

or
   msctl start

or, if you are root
   /sbin/chkconfig --add ms
   /sbin/service ms start

To stop the service:
   killall metadata-service

or the sane way
   msctl stop

or the administrative way
   /sbin/service ms stop

Remember to install ZODB first.  If you installed it in a nonstandard
path, add ZODB's lib/python directory to your PYTHONPATH so that the
metadata service can find it.  Do the same if you installed the
metadata service itself in a nonstandard path.
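
To confirm that Python can actually locate ZODB after adjusting
PYTHONPATH, a small check like the one below may help.  It is a sketch
written for Python 3 and is not part of the metadata service; the helper
names are hypothetical.  The top-level package name "ZODB" is the name
under which ZODB is imported.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if Python can locate a top-level module on sys.path."""
    return importlib.util.find_spec(module_name) is not None


def add_search_path(directory):
    """Prepend a directory to sys.path for the current process only.

    This has roughly the effect that listing the directory in
    PYTHONPATH has at interpreter startup.
    """
    if directory not in sys.path:
        sys.path.insert(0, directory)


def report(module_name="ZODB"):
    """Return a one-line message saying whether the module was found."""
    if is_importable(module_name):
        return "%s found" % module_name
    return "%s not found; check your PYTHONPATH" % module_name
```

Calling report() from the same shell environment in which you start the
metadata service shows whether the PYTHONPATH setting took effect.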

CONFIG AND USAGE NOTES

Check $installdir/etc/metadata-service.conf for a primer on the
available config options.

The database will be created by default in
$installdir/var/lib/metadata-service-database, and will quickly
grow to a very large number of files.

Make sure you have lots of free space in /, or mount an empty
volume in that directory (e.g. you might create a large, empty
file, make a filesystem on it, and mount it -o loop).

The default target load average is 1.0.  This means that once
the load average goes above 1.0 (plus a notch), the metadata
service will scale back in indexing speed.  That should make for
a responsive system while indexing, but if you want indexing to
go faster, be my guest and change it in MetadataService.py.
You can change the setting at runtime with:

   msctl setload (float value between 0.5 and 10)
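
To illustrate how a target load average can drive indexing speed, here
is a minimal sketch.  It is not the code in MetadataService.py: the
function names, the delay values, and the scaling factor are all
assumptions chosen for illustration.  Only the 0.5 to 10 range and the
default target of 1.0 come from the notes above.

```python
import os


def clamp_target(value, low=0.5, high=10.0):
    """Keep a requested target load inside the accepted range."""
    return max(low, min(high, float(value)))


def indexing_delay(load, target=1.0, base_delay=0.1, max_delay=5.0):
    """Return the pause, in seconds, to take between indexed files.

    At or below the target load the indexer runs at full speed
    (base_delay).  Above it, the pause grows in proportion to the
    excess load, up to max_delay.
    """
    if load <= target:
        return base_delay
    return min(max_delay, base_delay * (1.0 + 10.0 * (load - target)))


def current_delay(target=1.0):
    """Compute the pause from the system's 1-minute load average."""
    load_1min = os.getloadavg()[0]  # available on Unix-like systems
    return indexing_delay(load_1min, clamp_target(target))
```

With these assumed numbers, a load of 2.0 against a target of 1.0
stretches the pause from 0.1 to 1.1 seconds per file, and raising the
target lets indexing stay at full speed under heavier load.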
