jsFind - generate index for jsFind using B-Tree
use jsFind;

my $t = new jsFind(B => 4);
my $f = 1;
foreach my $k (qw{minima ut dolorem sapiente voluptatem}) {
    $t->B_search(
        Key    => $k,
        Data   => { "path" => { t => "word $k", f => $f } },
        Insert => 1,
        Append => 1,
    );
}
This module can be used to create index files for jsFind, a powerful tool for adding a search engine to a CDROM archive or catalog without requiring the user to install anything.
The main differences between this module and the scripts delivered with jsFind are:
You can also examine the examples that come as tests with this module, for example t/04words.t.
jsFind is the module implementing the methods which you, the user, are going to use to create indexes.
Create a new tree. Arguments are B, which is the maximum number of keys in each node, and an optional Root node. Each root node may have child nodes.
All nodes are objects of class jsFind::Node.
my $t = new jsFind(B => 4);
Search, insert, append or replace data in B-Tree
$t->B_search(
    Key    => 'key value',
    Data   => {
        "path" => {
            "t" => "title of document",
            "f" => 99,
        },
    },
    Insert => 1,
    Append => 1,
);
Semantics:
If the key is not found, insert it iff the Insert argument is present.
If the key is found, replace the existing data iff the Replace argument is present, or add the new datum to the existing data iff the Append argument is present.
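The rules above can be sketched with a plain hash standing in for the tree. This is only an illustration of the documented semantics, not the module's implementation: store_sketch is a hypothetical helper, a true value is treated as "present", and giving Replace priority over Append when both are passed is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tree: a plain hash keyed by search term.
# It mirrors only the documented Insert/Replace/Append semantics.
sub store_sketch {
    my ($index, %args) = @_;
    my ($key, $data) = @args{qw(Key Data)};

    if (!exists $index->{$key}) {
        # Key not found: insert only when Insert is given.
        $index->{$key} = { %$data } if $args{Insert};
    }
    elsif ($args{Replace}) {
        # Key found: Replace overwrites the existing data (assumed priority).
        $index->{$key} = { %$data };
    }
    elsif ($args{Append}) {
        # Key found: Append adds the new datum to the existing data.
        $index->{$key}{$_} = $data->{$_} for keys %$data;
    }
    return;
}

my %index;
store_sketch(\%index, Key => 'ut', Data => { 'a.html' => { t => 'A', f => 1 } },
    Insert => 1, Append => 1);
store_sketch(\%index, Key => 'ut', Data => { 'b.html' => { t => 'B', f => 2 } },
    Insert => 1, Append => 1);
# No Insert argument, so this unknown key is ignored.
store_sketch(\%index, Key => 'et', Data => { 'c.html' => { t => 'C', f => 1 } });
```

Passing both Insert and Append, as in the synopsis, gives the usual indexing behaviour: the first occurrence of a word creates the entry and later occurrences add further documents to it.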
Return B (maximum number of keys)
my $max_size = $t->B;
Returns root node
my $root = $t->root;
Returns true if the node is overfull
if ($node->node_overfull) { something }
Returns your tree as formatted string.
my $text = $root->to_string;
Mostly useful for debugging, as the output leaves much to be desired.
Create Graphviz graph of your tree
my $dot_graph = $root->to_dot;
Create XML index files for jsFind. Call this after your B-Tree has been filled with data.
$root->to_jsfind('/full/path/to/index/dir/');
Returns number of nodes in created tree.
There is also a longer form if you want to recode your data from one charset into another (probably UTF-8):
$root->to_jsfind('/full/path/to/index/dir/','ISO-8859-2','UTF-8');
Destination encoding is UTF-8 by default, so you don't have to specify it.
$root->to_jsfind('/full/path/to/index/dir/','WINDOWS-1250');
This is an internal function that recodes the charset.
It will also try to decode entities in data using HTML::Entities.
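As a rough illustration of the recoding step, the sketch below converts a byte string between charsets with the core Encode module. The helper name recode_sketch is hypothetical, Encode is a stand-in rather than necessarily the library this module uses internally, and the entity decoding mentioned above is left out.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical helper: recode a byte string from one charset to another.
# Encode is used here as a stand-in; entity decoding is not shown.
sub recode_sketch {
    my ($octets, $from, $to) = @_;
    $to ||= 'UTF-8';                 # destination defaults to UTF-8
    from_to($octets, $from, $to);    # converts the local copy in place
    return $octets;
}

my $latin2 = "\xE8okolada";          # "čokolada" encoded as ISO-8859-2
my $utf8   = recode_sketch($latin2, 'ISO-8859-2');
```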
Each node has k key-data pairs, with B <= k <= 2B, and each has k+1 subnodes, which might be null.
The node is a blessed reference to a list with three elements:
($keylist, $datalist, $subnodelist)
Each is a reference to a list.
The null node is represented by a blessed reference to an empty list.
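The layout described above might look like the following. NodeSketch is a hypothetical package used so the example does not depend on (or clash with) jsFind::Node, and the is_empty and size methods here are simplified guesses at the behaviour, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical package mirroring the documented node layout.
package NodeSketch;

sub new {
    my ($class, @lists) = @_;
    # With no arguments this yields the null node: a blessed empty list.
    return bless [@lists], $class;
}

# The null node has no elements at all.
sub is_empty { my $self = shift; return @$self == 0 ? 1 : 0 }

# Number of keys; the null node holds none.
sub size {
    my $self = shift;
    return $self->is_empty ? 0 : scalar @{ $self->[0] };
}

package main;

my $null = NodeSketch->new;
my $leaf = NodeSketch->new(
    [ 'dolorem', 'minima' ],        # $keylist: k keys
    [ { f => 1 }, { f => 2 } ],     # $datalist: one datum per key
    [ $null, $null, $null ],        # $subnodelist: k+1 subnodes, all null
);
```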
Create New node
my $node = new jsFind::Node ($keylist, $datalist, $subnodelist);
You can also omit the argument list to create an empty node.
my $empty_node = new jsFind::Node;
Locate key in node using linear search. This should probably be replaced by binary search for better performance.
my ($found, $index) = $node->locate_key($key, $cmp_coderef);
Argument $cmp_coderef is an optional reference to a custom comparison operator.
Returns (1, $index) if $key[$index] eq $key.
Returns (0, $index) if key could be found in $subnode[$index].
In scalar context, just returns 1 or 0.
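A binary search with the same return convention could look like this. It is a sketch of the suggested replacement, not code from the module: locate_key_sketch is a hypothetical function operating on a plain sorted array, and calling the comparator as $cmp->($key, $candidate) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: binary search over a sorted key list, following the
# return convention described above. Not part of jsFind::Node.
sub locate_key_sketch {
    my ($keys, $key, $cmp) = @_;
    $cmp ||= sub { $_[0] cmp $_[1] };     # default: string comparison

    my ($lo, $hi) = (0, scalar @$keys);   # search window is [$lo, $hi)
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $c   = $cmp->($key, $keys->[$mid]);
        if ($c == 0) {
            return wantarray ? (1, $mid) : 1;
        }
        elsif ($c < 0) { $hi = $mid }
        else           { $lo = $mid + 1 }
    }
    # Not found: $lo is the index of the subnode that could hold the key.
    return wantarray ? (0, $lo) : 0;
}

my @keys = qw(dolorem minima sapiente ut);
my ($found, $index) = locate_key_sketch(\@keys, 'sapiente');   # (1, 2)
my ($miss,  $slot)  = locate_key_sketch(\@keys, 'nihil');      # (0, 2)
```

Since a node holds at most 2B keys, the gain over linear search is modest for small B; it matters more when nodes are wide.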
Creates new empty node
$node = $root->emptynode; $new_node = $node->emptynode;
Test if node is empty
if ($node->is_empty) { something }
Return the $i-th key from the node
my $key = $node->key($i);
Return the $i-th data from the node
my $data = $node->data($i);
Set the key/data pair for the $i-th element in the node
$node->kdp_replace($i, "key value" => {
    "data key 1" => "data value 1",
    "data key 2" => "data value 2",
});
Insert key/data pair in tree
$node->kdp_insert("key value" => "data value");
No return value.
Adds new data keys and values to the $i-th element in the node
$node->kdp_append($i, "key value" => {
    "added data key" => "added data value",
});
Set new or return existing subnode
# return 4th subnode
my $my_node = $node->subnode(4);
# create new subnode 5 from $my_node
$node->subnode(5, $my_node);
Test if node is leaf
if ($node->is_leaf) { something }
Return number of keys in the node
my $nr = $node->size;
Split the node into two halves so that keys 0 .. $n-1 are in one node and keys $n+1 ... $size are in the other.
my ($left_node, $right_node, $kdp) = $node->halves($n);
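The split can be pictured on plain array references. halves_sketch below is a hypothetical helper, not the module's method: it assumes the key at index $n is handed back separately as the key/data pair, and that subnodes 0 .. $n go to the left half and the rest to the right.

```perl
use strict;
use warnings;

# Hypothetical helper working on plain array refs rather than jsFind::Node
# objects: split at index $n and return the middle key/data pair separately.
sub halves_sketch {
    my ($keys, $data, $subnodes, $n) = @_;
    my $last = $#$keys;

    my $left = [
        [ @$keys[0 .. $n - 1] ],
        [ @$data[0 .. $n - 1] ],
        [ @$subnodes[0 .. $n] ],
    ];
    my $right = [
        [ @$keys[$n + 1 .. $last] ],
        [ @$data[$n + 1 .. $last] ],
        [ @$subnodes[$n + 1 .. $last + 1] ],
    ];
    my $kdp = [ $keys->[$n], $data->[$n] ];
    return ($left, $right, $kdp);
}

my ($left, $right, $kdp) = halves_sketch(
    [qw(a b c d e)],            # keys
    [1 .. 5],                   # data
    [qw(s0 s1 s2 s3 s4 s5)],    # subnodes (k+1 of them)
    2,                          # split around key "c"
);
```

Each half again satisfies the "k keys, k+1 subnodes" shape, which is what allows the middle pair to be pushed up into the parent node.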
Dumps tree as string
my $str = $root->to_string;
Recursively walk the nodes of the tree
Escape <, >, & and " to produce valid XML
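A minimal sketch of such escaping follows. xml_escape_sketch is a hypothetical name, and the module's own routine may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper; not the module's own escaping routine.
sub xml_escape_sketch {
    my ($text) = @_;
    # Ampersand first, so the entities produced below are not escaped twice.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $escaped = xml_escape_sketch('<a href="x">Q&A</a>');
# $escaped is now: &lt;a href=&quot;x&quot;&gt;Q&amp;A&lt;/a&gt;
```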
Create jsFind xml files
my $nr=$tree->to_jsfind('/path/to/index','0');
Returns number of elements created
jsFind web site http://www.elucidsoft.net/projects/jsfind/
B-Trees in perl web site http://perl.plover.com/BTree/
Mark-Jason Dominus <mjd@pobox.com> wrote BTree.pm, which was the base for this module
Shawn P. Garbett <shawn@elucidsoft.net> wrote jsFind
Dobrica Pavlinusic <dpavlin@rot13.org> wrote this module
Copyright (C) 2004 by Dobrica Pavlinusic
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.