--- /dev/null
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
+
+<refentry id='glitelbharvester'>
+
+ <refmeta>
+ <refentrytitle>glite-lb-harvester</refentrytitle>
+ <manvolnum>1</manvolnum>
+ <refmiscinfo>EU EGEE Project</refmiscinfo>
+ </refmeta>
+
+ <refnamediv id='name'>
+ <refname>glite-lb-harvester</refname>
+ <refpurpose>daemon for processing L&B notifications</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv id='synopsis'>
+ <cmdsynopsis>
+ <command>glite-lb-harvester</command>
+
+ <arg><group choice='plain'>
+ <arg>-h</arg>
+ <arg>--help</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>-v</arg>
+ <arg>--version</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>-d</arg>
+ <arg>--debug</arg>
+ </group> <replaceable>LEVEL</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-D</arg>
+ <arg>--daemonize</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>-i</arg>
+ <arg>--pidfile</arg>
+ </group> <replaceable>PIDFILE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-s</arg>
+ <arg>--threads</arg>
+ </group> <replaceable>N</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-t</arg>
+ <arg>--ttl</arg>
+ </group> <replaceable>TIME</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-H</arg>
+ <arg>--history</arg>
+ </group> <replaceable>TIME</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-c</arg>
+ <arg>--config</arg>
+ </group> <replaceable>FILE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-m</arg>
+ <arg>--pg</arg>
+ </group> <replaceable>USER/PWD@SERVER:DBNAME</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-n</arg>
+ <arg>--notifs</arg>
+ </group> <replaceable>FILE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-p</arg>
+ <arg>--port</arg>
+ </group> <replaceable>PORT</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-C</arg>
+ <arg>--cert</arg>
+ </group> <replaceable>FILE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-K</arg>
+ <arg>--key</arg>
+ </group> <replaceable>FILE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>-o</arg>
+ <arg>--old</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>-l</arg>
+ <arg>--cleanup</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>-w</arg>
+ <arg>--wlcg</arg>
+ </group></arg>
+
+ <arg><group choice='plain'>
+ <arg>--wlcg-binary</arg>
+ </group> <replaceable>EXECUTABLE</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>--wlcg-topic</arg>
+ </group> <replaceable>TOPIC</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>--wlcg-config</arg>
+ </group> <replaceable>FILENAME</replaceable></arg>
+
+ <arg><group choice='plain'>
+ <arg>--wlcg-flush</arg>
+ </group></arg>
+ </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>DESCRIPTION</title>
+ <para>
+L&B Harvester gathers information about jobs from L&B servers using the
+efficient L&B notification mechanism. It manages notifications and keeps them in
+persistent storage (a file or a database table) so that they can be reused on the
+next launch. It takes care of refreshing notifications and queries the L&B servers
+again when a notification expires.
+
+The tool was initially written for the Real Time Monitor (a project at Imperial
+College London) and was later extended with the MSG publish messaging mechanism for WLCG.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>REQUIREMENTS</title>
+ <para>
+The following is required on the L&B server side:
+ <itemizedlist>
+ <listitem><para>
+The <filename>lastUpdateTime</filename> index; see the "Changing Index Configuration" section in the L&B Admin Guide.
+ </para></listitem>
+ <listitem><para>
+The L&B harvester identity (certificate subject) listed in the super users file.
+ </para></listitem>
+ </itemizedlist>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>OPTIONS</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><option>-h</option>|<option>--help</option></term>
+ <listitem><para>
+Print short usage.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-v</option>|<option>--version</option></term>
+ <listitem><para>
+Print harvester version identifier.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-d</option>|<option>--debug</option></term>
+ <listitem><para>
+ Verbosity level:
+ <variablelist>
+ <varlistentry><term>0</term><listitem><para>error only</para></listitem></varlistentry>
+ <varlistentry><term>1</term><listitem><para>warnings</para></listitem></varlistentry>
+ <varlistentry><term>2</term><listitem><para>info/progress</para></listitem></varlistentry>
+ <varlistentry><term>3</term><listitem><para>debug</para></listitem></varlistentry>
+ <varlistentry><term>4</term><listitem><para>insane</para></listitem></varlistentry>
+ <varlistentry><term>+8 (8,9,10,11,12)</term><listitem><para>don't fork and no preventive restarts</para></listitem></varlistentry>
+ </variablelist>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-D</option>|<option>--daemonize</option></term>
+ <listitem><para>
+Daemonize and detach from console. Error messages are directed to syslog.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-i</option>|<option>--pidfile</option></term>
+ <listitem><para>
+File in which the process ID is stored. It is removed automatically on shutdown.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-s</option>|<option>--threads</option></term>
+ <listitem><para>
+Number of threads (slaves). The configured L&B servers are distributed evenly among the threads.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-t</option>|<option>--ttl</option></term>
+ <listitem><para>
+Validity (time to live) of the notifications. The daemon refreshes each notification ahead of its expiry as needed.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-H</option>|<option>--history</option></term>
+ <listitem><para>
+Historic dive limit in seconds. A value of 0 or less means unlimited.
+ </para><para>
+ When starting, the L&B harvester queries the L&B servers for existing jobs. It also queries an L&B server when a notification expires and cannot be refreshed in time. This parameter limits how far back into history the L&B harvester goes.
+ </para><para>
+ This parameter also determines the maximal retry time. When an L&B server is inaccessible or in an error condition, the harvester increases the retry time linearly, up to a maximum of half of this parameter. For example, with <option>-H 7200</option> the maximal retry time is 3600 seconds.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-c</option>|<option>--config</option></term>
+ <listitem><para>
+ Name of the config file with the list of L&B servers. When used together with the database option <option>-m</option> (<option>--pg</option>), this file takes precedence over the <filename>lb20</filename> table.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-m</option>|<option>--pg</option></term>
+ <listitem><para>
+Database connection string in the <filename>USER/PWD@SERVER:DBNAME</filename> form. The following database tables are used:
+ <itemizedlist>
+ <listitem><para>
+ <filename>lb20</filename> - the list of L&B servers is taken from this table. When option <option>-c</option> (<option>--config</option>) is specified too, the file takes precedence over this table.
+ </para><para>
+The table also has a <filename>monitored</filename> column: if any notification is inactive because of errors on the given L&B server (a notification expired or a new one could not be created), the value is set to <filename>false</filename>. After the notification is refreshed or created, the value is set back to <filename>true</filename>.
+ </para></listitem>
+ <listitem><para>
+ <filename>jobs</filename> - table for storing job states. A record is updated on each incoming notification, that is, whenever the state of the job changes on the L&B server.
+ </para></listitem>
+ </itemizedlist></para><para>
+The database schema can be found in the source code of <filename>org.glite.lb.harvester</filename>: <filename>examples/test.sql</filename>
+ </para><para>
+Developer note: information about notifications is kept in a file. It is possible to compile a binary that keeps this state in the database instead; such a binary is used by the test in the <filename>examples</filename> source directory.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-n</option>|<option>--notifs</option></term>
+ <listitem><para>
+File for internal use by the L&B harvester. It holds persistent information about active notifications or errors on L&B servers. Default is <filename>/var/tmp/notifs.txt</filename>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-p</option>|<option>--port</option></term>
+ <listitem><para>
+Specifies the port for listening and requests L&B nodes to send notification messages only to this port. May be needed for networks behind NAT or behind firewalls.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-C</option>|<option>--cert</option></term>
+ <listitem><para>
+X509 certificate file.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-K</option>|<option>--key</option></term>
+ <listitem><para>
+X509 key file.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-o</option>|<option>--old</option></term>
+ <listitem><para>
+"Silly" mode for L&B servers older than 2.0. In this mode the transfer of notifications is not optimized at all, but it works with older L&B servers.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-l</option>|<option>--cleanup</option></term>
+ <listitem><para>
+Cleans up all active notifications and quits.
+ </para><para>
+Each notification expires automatically. However, if you know that the notifications used in the previous run of the L&B harvester won't be needed, it is recommended to clean them up and spare resources on the L&B servers (queues with undelivered notification messages and matching rules).
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-w</option>|<option>--wlcg</option></term>
+ <listitem><para>
+Enables delivery to MSG publish. Messages are sent by executing a binary with proper parameters.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--wlcg-binary</option></term>
+ <listitem><para>
+Full path to msg-publish binary executable, which is called for sending messages. Default is <filename>/usr/bin/msg-publish</filename>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--wlcg-topic</option></term>
+ <listitem><para>
+Topic used in MSG publish messages. Default is <filename>org.wlcg.usage.jobStatus</filename>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--wlcg-config</option></term>
+ <listitem><para>
+Config file used in MSG publish. Default is <filename>/etc/msg-publish/msg-publish.conf</filename>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--wlcg-flush</option></term>
+ <listitem><para>
+Messages are sent to MSG publish in batches by default. This option enforces sending the messages one by one, on each notification from the L&B server, that is, for each job state change.
+ </para></listitem>
+ </varlistentry>
+
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>ENVIRONMENT</title>
+
+ <variablelist>
+ <varlistentry>
+ <term>GLITE_LB_HARVESTER_NO_REMOVE</term>
+ <listitem><para>
+<filename>0</filename> or <filename>false</filename> instructs L&B harvester to not remove temporary files with sent messages for MSG publish. By default temporary files with successfully sent messages are removed. Files with failed messages are always preserved.
+ </para><para>
+Intended for debugging purposes.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>EXAMPLES</title>
+
+ <refsect2>
+ <title>MSG publish infrastructure</title>
+ <para>
+The harvester sends notifications using the msg-publish infrastructure. The list of L&B servers to harvest is given in the config file specified by the <option>-c</option> option:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><command>glite-lb-harvester -c servers.txt -C certfile -K keyfile --wlcg</command></term>
+ <listitem><para>
+With newer L&B servers (2.0 or later).
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><command>glite-lb-harvester -c servers.txt -C certfile -K keyfile --wlcg --old</command></term>
+ <listitem><para>
+With L&B servers older than 2.0 (backward compatible, but with greedy notifications).
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+Custom configuration examples for MSG publish:
+ <itemizedlist>
+ <listitem><para>
+<option>--wlcg-binary</option> <filename>$HOME/bin/msg-publish</filename>
+ </para></listitem><listitem><para>
+<option>--wlcg-topic</option> <filename>org.wlcg.usage.JobStatus2</filename>
+ </para></listitem><listitem><para>
+<option>--wlcg-config</option> <filename>$HOME/etc/msg-publish.conf.wlcg</filename>
+ </para></listitem>
+ </itemizedlist>
+ </para>
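+ <para>
+As an illustrative sketch, the three options above can be combined into a single invocation. It reuses the placeholder file names <filename>servers.txt</filename>, <filename>certfile</filename> and <filename>keyfile</filename> from the previous examples; the paths under <filename>$HOME</filename> are assumptions about a local installation:
+ </para>
+ <screen>
+glite-lb-harvester -c servers.txt -C certfile -K keyfile --wlcg \
+    --wlcg-binary $HOME/bin/msg-publish \
+    --wlcg-topic org.wlcg.usage.JobStatus2 \
+    --wlcg-config $HOME/etc/msg-publish.conf.wlcg
+ </screen>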
+ </refsect2>
+
+ <refsect2>
+ <title>Real Time Monitor</title>
+ <para>
+The harvester uses a PostgreSQL database: table <filename>lb20</filename> lists the L&B servers to harvest (read-only), and table <filename>jobs</filename> receives the resulting job states (read/write):
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><command>glite-lb-harvester -C certfile -K keyfile --pg rtm/@:rtm -p 9004</command></term>
+ <listitem><para>
+In this case the L&B harvester connects to database <filename>rtm</filename> on <filename>localhost</filename> as user <filename>rtm</filename>. For incoming notifications it requests and listens only on port 9004.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
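+ <para>
+The connection string above leaves the password and the server empty. A sketch of a fully specified string in the same <filename>USER/PWD@SERVER:DBNAME</filename> form follows; the password <filename>secret</filename> and the host <filename>db.example.org</filename> are hypothetical values chosen only for illustration:
+ </para>
+ <screen>
+glite-lb-harvester -C certfile -K keyfile --pg rtm/secret@db.example.org:rtm -p 9004
+ </screen>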
+ </refsect2>
+
+ <refsect2>
+ <title>Other recommended options</title>
+ <para>
+Use <command>glite-lb-harvester --help</command> for a full summary of the options.
+ </para><para>
+For example:
+ <variablelist>
+ <varlistentry>
+ <term><option>--daemonize --pidfile /var/run/glite-lb-harvester.pid</option></term>
+ <listitem><para>
+Daemonizing and using syslog.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-d 2</option></term>
+ <listitem><para>
+Decreasing verbosity (2 for errors and warnings only).
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
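+ <para>
+These options can be added to any of the invocations shown earlier. As a sketch, with the placeholder file names from the MSG publish example:
+ </para>
+ <screen>
+glite-lb-harvester -c servers.txt -C certfile -K keyfile --wlcg \
+    --daemonize --pidfile /var/run/glite-lb-harvester.pid -d 2
+ </screen>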
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>EXIT</title>
+ <para>
+In non-daemon mode, the harvester can be stopped with CTRL-C.
+ </para><para>
+Use the pidfile in daemon mode (pidfile will vanish after exit):
+ </para><para>
+<command>kill `cat /var/run/glite-lb-harvester.pid`</command>
+ </para><para>
+All notifications are preserved on the L&B servers and will expire later. If they
+won't be needed, you can purge them at once:
+ </para><para>
+<command>glite-lb-harvester --cleanup</command>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>EXIT STATUS</title>
+ <variablelist>
+ <varlistentry>
+ <term>0</term>
+ <listitem><para>Success.</para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>1</term>
+ <listitem><para>Reloading, used only internally for preventive restarts.</para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>2</term>
+ <listitem><para>An error occurred. Messages are written to the console (foreground run) or to syslog (daemon run), depending on verbosity.</para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>AUTHOR</title>
+ <para>gLite L&B product team, CESNET.</para>
+ </refsect1>
+
+</refentry>