Remote Backend

This document describes how to use Xapian's facilities for distributed searching.

Overview

There are two sides to distributed searching. The client end is the program that initiates the search on behalf of a user, and the server end is the program that provides a search interface over a set of databases for the client. There can be many servers, with many clients sharing them. In theory, a server can also be a client to other servers, but this may not be very useful or efficient.

The client runs queries in the same way that it would on local databases, but with different database arguments. Instead of a type such as "quartz", use "remote". The extra parameters describe how to connect to the server. Once the database is opened, the query process is identical to that for any other database. Using a stub database with the auto backend is a neat way to wrap up access to a remote database.

The remote backend currently supports two client/server methods: prog and tcp. Both use the same protocol, but different means of contacting the server.

The Prog Method

The prog method spawns a program when the database is opened, and communicates with it over a Unix domain socket. It isn't intended for production use, but is useful for debugging and testing. The omprogsrv program, which is designed to be the spawned program, currently only uses text files indexed into inmemory databases.
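
The general mechanism (spawn a child process and talk to it over a Unix domain socket) can be sketched in Python as follows. This is only an illustration of the idea, not Xapian's implementation: attaching the socket to the child's stdin and stdout is an assumption, the child here is a small placeholder script rather than omprogsrv, and the names spawn_over_socketpair and exchange_line are hypothetical.

```python
import socket
import subprocess
import sys

# Stand-in for the spawned server: reads one line on stdin and echoes it
# back upper-cased on stdout. It is NOT omprogsrv, just a placeholder.
CHILD_CODE = (
    "import sys\n"
    "line = sys.stdin.readline()\n"
    "sys.stdout.write(line.upper())\n"
    "sys.stdout.flush()\n"
)


def spawn_over_socketpair(argv):
    """Start argv with its stdin and stdout attached to one end of a socket pair."""
    # On Unix, socketpair() returns two connected Unix domain sockets.
    parent_end, child_end = socket.socketpair()
    proc = subprocess.Popen(
        argv, stdin=child_end.fileno(), stdout=child_end.fileno()
    )
    child_end.close()  # the child holds its own copies now
    return proc, parent_end


def exchange_line(sock, request):
    """Send one newline-terminated request and read one newline-terminated reply."""
    sock.sendall(request)
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection
            break
        reply += chunk
    return reply
```

Calling spawn_over_socketpair([sys.executable, "-c", CHILD_CODE]) and then exchange_line on the returned socket performs one request/reply round trip with the placeholder child. The sketch relies on Unix domain sockets, so it is Unix-only.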

From the client end, the following database parameters are used:

The TCP Method

The tcp method uses TCP/IP sockets to connect to a running server on a remote (or indeed local) machine.

The client's database parameters are:

The server is omtcpsrv in the netprogs/ directory of the xapian-core distribution. This should be started and left running in the background before searches are performed.

The arguments omtcpsrv currently knows are:

--port PORTNUM
(required) the port to listen on.
--one-shot
Handle one connection, and then exit. If --one-shot is not used, then the server runs until it is killed manually.
--idle-timeout MSECS
Set the timeout on an idle connection.
--active-timeout MSECS
Set the timeout waiting for responses when the connection is active.
--timeout MSECS
Set the idle and active timeouts to the same value.
--quiet
Minimal output.

One or more databases need to be specified by listing their directories - they are opened using the "auto" pseudo-backend.
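
As a rough illustration of how these options combine, the following Python sketch assembles an omtcpsrv command line. The helper name build_omtcpsrv_args is hypothetical, and placing the database directories after the options and finding omtcpsrv on the PATH are assumptions rather than documented requirements.

```python
from typing import List, Optional, Sequence


def build_omtcpsrv_args(
    port: int,
    databases: Sequence[str],
    one_shot: bool = False,
    idle_timeout_ms: Optional[int] = None,
    active_timeout_ms: Optional[int] = None,
    quiet: bool = False,
) -> List[str]:
    """Assemble an omtcpsrv argument list from the options described above."""
    if not databases:
        raise ValueError("at least one database directory is required")
    # --port is the only required option.
    args = ["omtcpsrv", "--port", str(port)]
    if one_shot:
        args.append("--one-shot")
    if idle_timeout_ms is not None and idle_timeout_ms == active_timeout_ms:
        # --timeout sets the idle and active timeouts to the same value.
        args += ["--timeout", str(idle_timeout_ms)]
    else:
        if idle_timeout_ms is not None:
            args += ["--idle-timeout", str(idle_timeout_ms)]
        if active_timeout_ms is not None:
            args += ["--active-timeout", str(active_timeout_ms)]
    if quiet:
        args.append("--quiet")
    # Assumption: database directories follow the options.
    args.extend(databases)
    return args
```

The resulting list could be passed to subprocess.Popen to start the server in the background before any searches are run.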

Once started, the server will run and listen for connections on the configured port, currently handling them one by one (although this will change at some point).

Notes

A remote search should behave just like the equivalent local one, with the exception of a few currently unimplemented features (e.g. match decision functors).

Exceptions are propagated across the link and thrown again at the client end.
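
The idea of propagating an exception across the link can be sketched as below. This is a minimal illustration only: JSON is used here for readability and is not the remote protocol's wire format, and RemoteError, encode_reply and decode_reply are hypothetical names.

```python
import json


class RemoteError(Exception):
    """Hypothetical client-side exception carrying the server's error type."""

    def __init__(self, error_type: str, message: str):
        super().__init__(message)
        self.error_type = error_type


def encode_reply(handler, *args) -> bytes:
    """Server side: run handler, encoding either its result or its exception."""
    try:
        reply = {"ok": True, "result": handler(*args)}
    except Exception as exc:
        # Send the exception's type name and message instead of a result.
        reply = {"ok": False, "type": type(exc).__name__, "message": str(exc)}
    return json.dumps(reply).encode("utf-8")


def decode_reply(data: bytes):
    """Client side: return the result, or throw the server's error again."""
    reply = json.loads(data.decode("utf-8"))
    if not reply["ok"]:
        raise RemoteError(reply["type"], reply["message"])
    return reply["result"]
```

With this shape, calling code on the client handles a server-side failure with an ordinary try/except around the search call, much as it would for a local database.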