Tuesday, January 4, 2011

Zebra Server - Index Searcher (Z39.50 protocol)



Introduction:

A little background on this server project. After EPrints was set up as a production-grade repository, with its applications and customizations in place, we needed additional services alongside it to port its records from a backup or to a new storage server, and it had to support MARC21. Once MARC21 support was installed and rebuilt with EPrints (thanks to third-party software that is still open source), the next requirement was metadata harvesting: pulling records and indexes to a new remote repository server, that is, replicating metadata from a remote EPrints server. The service engine that pulls the indexed records, and the solution that made this work end to end, is none other than Zebra...

Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (e.g. email, XML, MARC) and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.

Zebra supports large databases (more than ten gigabytes of data, tens of millions of records). It supports incremental, safe database updates on live systems. You can access data stored in Zebra using a variety of Index Data tools (e.g. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients and toolkits.

Zebra is free software, available under the GPL license. It may be used by anyone without charge. If you wish to incorporate Zebra into a commercial software distribution, please contact Index Data about alternative licenses.

Notice: Optional, commercial support is available for Zebra.

Requirements:
Repository software
EPrints (example with MARC (XML/ISO) support)
Koha

Methodology:
Install these prerequisite packages:
(in case you need to build from a source RPM)
root@localhost# yum install gnutls-devel
root@localhost# yum install libicu-devel


For the CentOS operating system
Download the YAZ source RPM (the version current at the time of this installation):
root@localhost# wget http://download.fedora.redhat.com/pub/epel/5/SRPMS/yaz-2.1.54-1.el5.1.src.rpm
root@localhost# rpmbuild --rebuild yaz*src.rpm
root@localhost# cd /usr/src/redhat/RPMS/i386/
(the path in my case, on CentOS)

root@localhost# rpm -ivh libyaz-2*
root@localhost# rpm -ivh yaz-2*
root@localhost# rpm -ivh libyaz-dev*

For the Fedora operating system
root@localhost# yum install libyaz libyaz-devel
root@localhost# perl -MCPAN -e 'install "MARC::Record"'
root@localhost# perl -MCPAN -e 'install "MARC::Charset"'
root@localhost# perl -MCPAN -e 'install "MARC::File::XML"'
root@localhost# perl -MCPAN -e 'install "XML::Simple"'
root@localhost# perl -MCPAN -e 'install "Net::Z3950::ZOOM"'

Tar installation
Note:
Since I was using an old CentOS 5, the idZebra version that suited it best was 2.44 (idzebra-2.44.tar.gz). You can download it from the Zebra file/source archive.
Download the idzebra-xxx.tar.gz file
Unzip and untar it, then configure and build:
tar zxvf idzebra-xxx.tar.gz
cd idzebra-xxx
./configure --help
./configure
make
make install

Or, as an option, install into a directory you prefer
If you are not the system administrator, you can run the Zebra server from your home directory.
Run the following commands:
root@localhost# tar zxvf idzebra-xxx.tar.gz
root@localhost# cd idzebra-xxx
root@localhost# ./configure --prefix=$HOME
root@localhost# make
root@localhost# make install

The directory structure will be:
$HOME/bin
Contains zebraidx and zebrasrv
$HOME/include
Include files
$HOME/share
Manual pages
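
With a $HOME prefix, zebraidx and zebrasrv end up in $HOME/bin, which may not be on your search path. A minimal sketch, assuming a POSIX shell, that adds the directory to PATH only when it is missing:

```shell
#!/bin/sh
# Prepend $HOME/bin to PATH only if it is not already listed there.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                          # already present, nothing to do
    *) PATH="$HOME/bin:$PATH"; export PATH ;;    # add it in front
esac

echo "PATH now starts with: ${PATH%%:*}"
```

Put the same lines in your shell startup file if you want the change to persist across logins.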

Testing with MARC21
1) Go to the directory
root@localhost# cd idzebra-xxx/test/usmarc
You should see…
zebra.cfg (configuration file)
records (directory)
Note:
The records directory contains a file called ‘sample-marc’.
It holds MARC21 records in ISO 2709 format.

Generating the index
Run the following command:
root@localhost# zebraidx update records
Note:
Index files are created with the extension ‘.mf’.
They are essential for searching. Now you can start the Z39.50 server.

Starting Z39.50 Server

Run the following command
root@localhost# zebrasrv
You will see the following message
09:35:09-09/12 [log] zebra_start zebra.cfg 1.3.32
09:35:09-09/12 [server] Adding dynamic listener on tcp:@:9999 id=0
09:35:09-09/12 [server] Starting server zebrasrv pid=8856
1st Line: version number of the zebra server
2nd Line: zebra server is running on port 9999
3rd Line: Process id of program is 8856
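
If you want to pick the port and process id out of that startup output in a script, here is a small sketch. It works on a copy of the three sample lines above, not on a live server, and the sed patterns simply assume the log keeps this layout. On a real system you would feed it the file given to zebrasrv with its -l option.

```shell
#!/bin/sh
# Sample startup output, copied from the three log lines shown above.
log='09:35:09-09/12 [log] zebra_start zebra.cfg 1.3.32
09:35:09-09/12 [server] Adding dynamic listener on tcp:@:9999 id=0
09:35:09-09/12 [server] Starting server zebrasrv pid=8856'

# Listener port: the digits right after "tcp:@:" on the listener line.
port=$(printf '%s\n' "$log" | sed -n 's/.*listener on tcp:@:\([0-9][0-9]*\).*/\1/p')

# Process id: the digits right after "pid=" on the "Starting server" line.
pid=$(printf '%s\n' "$log" | sed -n 's/.*Starting server.* pid=\([0-9][0-9]*\).*/\1/p')

echo "port=$port pid=$pid"
```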

Then browse to:
http://localhost:9999

Testing with MARCXML
root@localhost# cd test/marcxml
You will see the MARCXML files:
m1.xml, m2.xml, m3.xml
Create a directory called ‘records’ and copy the XML files into it:
root@localhost# mkdir records
root@localhost# cp *.xml records
Note:
In this case each record is placed in a separate file, unlike the ISO 2709 file, where all the records are stored in one file.

Testing (searching live!) from server to client
1) Start the server
root@localhost# zebrasrv
If you want to index new MARC records
(we chose the MARC format):
root@localhost# zebraidx update records
(here ‘records’ is the directory that contains the newly uploaded *.marc files)


2) Windows client
Install the Mercury Z39.50 client (quick and easy to install on Windows)

Configuring Mercury client

Edit for target Zebra Server

Options /setting for Mercury Database

Search entries (Title/Authors)

Displaying /parsing Marc records


Here is the corresponding log output on the Zebra server:
20:44:27-12/02 zebrasrv [log] dict_lookup_grep:(\x01\x0A)(kuwerdas)
20:44:27-12/02 zebrasrv [request] search defualt OK 3 1 1+3 RPN @attset Bib-1 @attr 1=4 Kwerdas
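
The `@attr 1=4` in the request line above is a Bib-1 use attribute: 1=4 selects the title index, and 1=1003 selects the author index. As a rough sketch, a small shell helper (the function name pqf and the author term "Smith" are made up for illustration) can build such query strings for the two search entries:

```shell
#!/bin/sh
# Build a PQF (prefix query format) string.
# $1 = Bib-1 use attribute number, $2 = search term.
# Note: the term is wrapped in double quotes as-is; quotes inside it are not escaped.
pqf() {
    printf '@attr 1=%s "%s"\n' "$1" "$2"
}

title_query=$(pqf 4 "Kwerdas")      # 4 = Title
author_query=$(pqf 1003 "Smith")    # 1003 = Author; "Smith" is a placeholder term

echo "$title_query"
echo "$author_query"
```

The resulting strings can be used after the find command of a Z39.50 client such as yaz-client.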


3) Linux client
Just run the YAZ client:
root@localhost# yaz-client
Z>
Then...
Z> auth "username" "password"
Z> open ip_address:9999
Z> scan

Note:
Watch the Zebra server log while you do this: it records a response every time the client sends a command.
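
The same kind of session can also be run without typing at the Z> prompt. The sketch below is only an illustration: the host, port, and search term are assumptions to adjust for your server, and it relies on yaz-client reading its commands from standard input. If yaz-client is not installed, it just prints the commands it would send.

```shell
#!/bin/sh
# Hypothetical target: change these to match your Zebra server.
host=localhost
port=9999

# One yaz-client command per line, as you would type them at the Z> prompt.
commands="open $host:$port
find @attr 1=4 Kwerdas
show 1
quit"

if command -v yaz-client >/dev/null 2>&1; then
    # Feed the commands to yaz-client on standard input.
    printf '%s\n' "$commands" | yaz-client
else
    # yaz-client is not available: show what would have been sent.
    printf '%s\n' "$commands"
fi
```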


Remarks:

Since I installed on two flavors (CentOS and Fedora), I will jot down the installation problems I ran into on each, for example with Net::Z3950::ZOOM.

Fedora
1) Installing YAZ is easy
root@localhost# yum install libyaz libyaz-devel
2) If you hit the error "could not find /usr/sbin/ld -lwrap" (the linker cannot find libwrap, the TCP wrappers library):
root@localhost# yum install "*wrap*"
(this installs every matching package)

3) Another problem involves the encryption/decryption libraries that YAZ needs. If it occurs, here is the fix:
root@localhost# yum install "*crypt*"
...and after that it should be fine for good!

CentOS
1) Installing YAZ is difficult
To resolve this, download the YAZ source, make sure you rebuild it completely, and then install it. If the rebuild fails, install these packages first to be safe:
root@localhost# yum install openssl openssl-devel readline readline-devel libtool
root@localhost# yum install rpm-build
Once those are in place, the YAZ rebuild should go through:
root@localhost# rpmbuild --rebuild yaz-2.1.54-1.el5.1.src.rpm

If you encounter Zebra build errors (in *.o, *.c or *.h files)!
This usually happens when your installed YAZ is older than the Zebra source you are building. You need YAZ and Zebra versions that match each other and your platform.

Index Data does not have the resources to build YAZ and Zebra for every RPM-based system out there. However, the developers provide spec files that usually work. Here is the procedure for building YAZ and Zebra for your platform (Fedora Core 1X on i386).
1) Uninstall ALL existing YAZ components that you got from
kojipkg, etc.
root@localhost# rpm -ev libyaz3
Run the same command for any other components you have (say, yaz).

Now the procedure is as follows:
root@localhost# wget http://ftp.indexdata.dk/pub/yaz/yaz-3.0.47.tar.gz
root@localhost# rpmbuild -ta yaz-3.0.47.tar.gz
Note:
If you cannot build the YAZ source, go back to the rpm-build installation step (yum install rpm-build). Then:

root@localhost# ls /usr/src/redhat/RPMS/i386
root@localhost# rpm -vi /usr/src/redhat/RPMS/i386/*yaz*.rpm

You have now installed all YAZ components, including libyaz3-devel, which
is required for building Zebra. Now download the latest Zebra like this
(the file version here is only an assumption):

root@localhost# wget http://ftp.indexdata.dk/pub/zebra/idzebra-2.0.40.tar.gz
root@localhost# rpmbuild -ta idzebra-2.0.40.tar.gz
root@localhost# rpm -vi /usr/src/redhat/RPMS/i386/*zebra*.rpm

If the installation does not work, please re-check the preceding instructions (YAZ and Zebra installation); you may have missed a step!

Conclusions:

