| Priority | Difficulty | Area | Task | Release | Owner |
|---|---|---|---|---|---|
| | | | Update the PLATFORMS file. | * | James |
| | | | Make sure the non-autogenerated docs are kept up to date. | * | James |
| High | 3 | API | Put Xapian into its own namespace, and remove Om from class names; give omparsequery and OmQueryParser more similar names; make xapian.h the main header, and make om/om.h a compatibility header with a stack of #defines so old code keeps working in the short term (e.g. #define OmEnquire Xapian::Enquire)? Sort out final names for the database factory functions. | 0.7 | Olly |
| Medium | | | Fix up the examples and make sure they are actually instructive. Add a comment to each describing what it demonstrates. I've made a start. delve is a reasonable example. msearch probably needs simplifying to just do a probabilistic search, or to use OmQueryParser. Add an example which copies a quartz database as "Full compaction with revision 1" (and perhaps deletes/renames the bitmaps) as described in the quartz docs. This should produce a small database optimised for fast searching. | 0.8 | Olly |
| Medium | | | indexgraph -> extra (needs to build as a support library?) [James expressed an interest in this as dbtools needs it] | 0.8 | James |
| Medium | 3 | API | Provide a fake term (empty termname) which indexes all documents, thus providing a clean way to iterate through them. This would be used for a real "NOT" operator. Olly has a patch which mostly implements this for the InMemory backend. | 0.8 | Olly |
| Medium | 2 | General | Check for zero byte cleanness wherever strings are used. There are a number of c_str() calls in the code, but I believe all those in the core library (excluding the bindings) are harmless as of 2002-04-29. There may be other zero byte issues though. xapian-applications/dbtools also uses c_str() where it should probably use data() and length(). | 0.8 | |
| Medium | 3 | Quartz | Make quartz databases autoflush when enough changes have been made, based on the memory used as a proportion of that available, rather than simply when a count of changes is reached. Remove the hardcoded count of 1000 changes. | 0.8 | |
| Medium | | API | Consider adding default ctors to any API classes which are missing them. | 1.0 | Olly |
| Medium | 3 | Databases | Change all internal references to the net/network backend to the remote backend (in step with the external naming). | 1.0 | |
| Medium | 5 | Documentation | Ensure that the API documentation covers the entire API (i.e. that all methods and classes in the API have documentation comments); see the doxygen-generated file docs/doxygen_api_warnings for a list of undocumented methods. Then read through the generated API docs, and rewrite doc comments to improve clarity and make them more coherent. | 1.0 | Olly |
| Medium | 4 | General | Allow setting of the document length in OmDocument? (It is currently defined to be the sum of the wdfs.) | 1.0 | |
| Medium | 2 | OmQuery | Move all serialisation of OmQuery into OmQuery (out of socketcommon.cc and localmatch): at present, modifying OmQuery requires changes in 3 separate parts of the code. | 1.0 | |
| Medium | 5 | Porting | Produce a Microsoft Windows version, probably by cross-compiling to mingw. | 1.0 | James |
| Medium | 2 | Quartz | Ensure that quartz databases don't have a problem if there is no positional information entry available for a term / document combination. | 1.0 | |
| Low | 3 | Documentation | Add notes about catching exceptions throughout userman, particularly in the examples (e.g. the search engine example). | 1.0 | |
| Low | 4 | General | Allow user-written backends? It would be good to allow them to register themselves automatically at runtime (or perhaps at link time) to replace the current conditional compilation scheme. Do this using sub-classing and factory classes? A bit like the weighting schemes. | 1.0 | |
| Low | | Quartz | Shouldn't stall just because a stale db_lock exists. Instead of just an empty file, put the hostname and pid in the file (or use a symlink with the info in the target, since that can be created atomically) and check the details; that way we can spot a stale lock from a process on the same machine. Or touch the lock periodically to keep it? Or use fcntl(), except that doesn't handle locking within a process, so it needs to be combined with another locking scheme, which I think boils down to needing thread locks to work in a multi-threaded process... | 1.0 | Olly |
| | | | .deb built, control files via autoconf. | 1.0 | Olly |
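
The compatibility-header idea in the namespace task could look roughly like the sketch below. It is only an illustration: the two headers are collapsed into one block, and the `description()` member and the `describe_old_style()` helper are hypothetical names invented for the example, not part of any real API.

```cpp
// Sketch only: stand-ins for the proposed xapian.h and om/om.h headers.
#include <cassert>
#include <string>

// --- what xapian.h might declare (heavily simplified, hypothetical) ---
namespace Xapian {
    class Enquire {
      public:
        // Hypothetical member, present only so the example has something to call.
        std::string description() const { return "Xapian::Enquire"; }
    };
}

// --- what om/om.h might contain as a compatibility layer ---
// The preprocessor rewrites the old class name to the new namespaced one.
#define OmEnquire Xapian::Enquire

// Old-style code written against the Om names compiles unchanged.
inline std::string describe_old_style() {
    OmEnquire enquire;
    return enquire.description();
}

// Small self-check that the old name really is the new class.
inline void check_compat_header() {
    assert(describe_old_style() == "Xapian::Enquire");
}
```

One design point worth weighing: a `typedef Xapian::Enquire OmEnquire;` would give the same source compatibility while respecting C++ scoping rules, whereas a #define rewrites every occurrence of the token.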
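
The zero byte cleanness task turns on the difference between c_str() and the data()/length() pair. A minimal sketch of the failure mode follows; all function names are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// A sample term containing an embedded zero byte (5 bytes in total).
inline std::string make_sample_term() {
    return std::string("ab\0cd", 5);
}

// Length as C-string code sees it: strlen() stops at the first zero byte.
inline std::size_t length_via_c_str(const std::string &s) {
    return std::strlen(s.c_str());
}

// Copying via c_str() alone silently truncates at the zero byte.
inline std::string copy_via_c_str(const std::string &s) {
    return std::string(s.c_str());
}

// Copying via data() and length() keeps every byte.
inline std::string copy_zero_byte_clean(const std::string &s) {
    return std::string(s.data(), s.length());
}

// Small self-check of the behaviour described above.
inline void check_zero_byte_cleanness() {
    const std::string term = make_sample_term();
    assert(term.length() == 5);
    assert(length_via_c_str(term) == 2);
    assert(copy_via_c_str(term) != term);
    assert(copy_zero_byte_clean(term) == term);
}
```

A c_str() call is harmless when the string is known not to contain zero bytes (a filename, say), which is why each use needs checking individually rather than replacing wholesale.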
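
For the user-written backends task, one possible shape for self-registration via sub-classing and factories is sketched below. Every name here (`Backend`, `BackendRegistrar`, `ToyBackend`, `open_backend`) is hypothetical; this is not how the existing backends are wired up.

```cpp
#include <map>
#include <string>

// Hypothetical base class that a user-written backend would sub-class.
class Backend {
  public:
    virtual ~Backend() {}
    virtual std::string name() const = 0;
};

typedef Backend *(*BackendFactory)();

// The registry lives in a function-local static so that it is constructed
// before any registrar object uses it, whatever the link order.
inline std::map<std::string, BackendFactory> &backend_registry() {
    static std::map<std::string, BackendFactory> registry;
    return registry;
}

// Constructing one of these at namespace scope registers a backend
// before main() starts.
struct BackendRegistrar {
    BackendRegistrar(const std::string &name, BackendFactory factory) {
        backend_registry()[name] = factory;
    }
};

// A toy user-written backend which registers itself.
class ToyBackend : public Backend {
  public:
    std::string name() const { return "toy"; }
    static Backend *create() { return new ToyBackend; }
};

static BackendRegistrar toy_registrar("toy", &ToyBackend::create);

// Look a backend up by name; returns 0 if nothing registered that name.
// The caller owns the returned object.
inline Backend *open_backend(const std::string &name) {
    std::map<std::string, BackendFactory>::const_iterator i =
        backend_registry().find(name);
    if (i == backend_registry().end()) return 0;
    return i->second();
}
```

With this arrangement, linking in the object file containing `toy_registrar` is enough to make the backend available, which is the property that would let it replace conditional compilation.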
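
The hostname-and-pid idea in the db_lock task can be sketched as follows, assuming a POSIX system. The "hostname:pid" format and all function names are assumptions made for illustration; the sketch only covers the staleness check, not creating the lock atomically.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical lock contents: "<hostname>:<pid>".
inline std::string make_lock_info(const std::string &host, pid_t pid) {
    return host + ":" + std::to_string(static_cast<long>(pid));
}

// Name of the local machine, or an empty string if it can't be determined.
inline std::string this_hostname() {
    char buf[256];
    if (gethostname(buf, sizeof(buf)) != 0) return std::string();
    buf[sizeof(buf) - 1] = '\0';  // gethostname() may not terminate on truncation
    return std::string(buf);
}

// Returns true only when the lock was taken on local_host by a process
// which no longer exists. Anything we can't verify counts as "not stale".
inline bool lock_is_stale(const std::string &info,
                          const std::string &local_host) {
    std::string::size_type colon = info.rfind(':');
    if (colon == std::string::npos) return false;  // unparseable
    if (local_host.empty() || info.substr(0, colon) != local_host)
        return false;  // another machine: can't check its processes
    long pid = std::strtol(info.c_str() + colon + 1, 0, 10);
    if (pid <= 0) return false;
    // Signal 0 sends nothing; it only checks whether the process exists.
    if (kill(static_cast<pid_t>(pid), 0) == 0) return false;
    return errno == ESRCH;  // EPERM means it exists but isn't ours
}
```

A caller would pass the contents read from db_lock along with `this_hostname()`. The check can be fooled if the pid has been reused by an unrelated process, so it reduces stalls rather than eliminating them.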