--- /dev/null
+\documentclass{egee}
+\usepackage{comment}
+
+\def\LB{L\&B}
+
+\title{\LB\ Performance Test Plan}
+\author{CESNET EGEE JRA1 team}
+\DocIdentifier{EGEE-JRA1-??}
+\Date{\today}
+\Activity{JRA1: Middleware Engineering and Integration}
+\DocStatus{DRAFT}
+\Dissemination{PUBLIC}
+\DocumentLink{}
+
+%\def\req{\noindent\textbf{Prerequisites:}}
+%\def\how{\noindent\textbf{How to run:}}
+%\def\result{\noindent\textbf{Expected result:}}
+
+\def\path#1{{\normalfont\textsf{#1}}}
+\def\code#1{\texttt{#1}}
+\def\todo#1{\textbf{TODO:} #1}
+
+\begin{document}
+
+\input{frontmatter}
+\newpage
+\tableofcontents
+\newpage
+
+\section{Rationale}
+\todo{}
+
+\begin{verbatim}
+
+L&B Performance Testing
+=======================
+
+- all source modifications for tests are in CVS, conditionally compiled
+ only when the appropriate symbol is defined
+
+- binaries for all tests are built using a special property
+ for the ant target (or an environment variable for the Makefile), which
+ compiles the sources using the right #define combinations
+
+- component tests are run by shell scripts located under the component
+ directories; these tests may, however, require binaries from other
+ components
+
+- all tests use sequences of events for typical jobs (small job, big
+ job, small DAG, big DAG) prepared beforehand. These events are
+ stored in CVS as files in ULM format.
+
+- events are generated by the stresslog program, which reads the ULM
+ text of events for a particular test job and logs the event sequence
+ directly by calling *_DoLogEvent<variant>. The number of test jobs is
+ configurable. Stresslog inserts into every event a timestamp of when
+ the event was generated and sent.*
+
+- events are consumed by breaking normal event processing either in the
+ component being tested or in the next component in the chain, which is
+ instrumented to read and discard events immediately. The consumption
+ itself is done by calling a special function which takes the current
+ time, extracts the timestamp from the event and prints the difference
+ (i.e. the event processing time).* These "break points" are chosen to
+ measure the throughput of the various component parts and to identify
+ possible bottlenecks within the components.
+
+ * the only exception is the test of the logging library itself
+
+- if the test includes the bookkeeping server and/or proxy, the test
+ script preregisters the test jobs within the LB and stores their IDs
+ in a separate file to enable re-use by other load-generating tools
+ (status queries, for example)
+
+- test results:
+ - some numbers must be reported by the components themselves, not by
+ the event generator (due to the asynchronous nature of LB). The
+ test script collects those numbers and presents them as the test
+ result at the end of testing.
+
+ - after completion, the test scripts print the table described for the
+ respective test, filled in with the measured values (i.e. the table
+ is not filled in manually by a human tester)
+
+ - event throughput = 1/(time_delivered - time_arrived)
+ * only if next event is sent after previous was delivered
+
+? measure job throughput for event patterns of typical jobs or deduce
+job throughput from throughput of selected types of events?
+
+
+I) Component tests
+ ***************
+
+- tests of the isolated components on one node
+- may require binaries from other components to produce/consume events
+
+--------------------
+Logging library test
+--------------------
+
+* component:
+ org.glite.lb.client
+
+* binaries required:
+ logevent_libtest
+
+* test shell script:
+ perftest_loglib
+
+* input required:
+ - events
+
+* test description:
+ - measures the time required to format the given events into ULM.
+ Events are read from a file, parsed into components, timestamped and
+ produced.
+
+ - events produced:
+ - by calling logging function edg_wll_LogEvent*()
+
+ - events consumed:
+ - discarded by logging function instead of sending via
+ appropriate protocol (LogEventMaster)
+
+* results:
+
+ job type (size) throughput (100k jobs)
+ -----------------------------------------
+ small job
+ big job
+ small DAG
+ big DAG
+
+
+
+----------------
+Locallogger test
+----------------
+
+* component:
+ org.glite.lb.logger
+
+* binaries required:
+ stresslog
+ glite_lb_logd_perf
+ glite_lb_logd_perf_nofile
+ - does not store events in file
+ glite_lb_interlogd_perf_empty
+ - consumes immediately after reading event
+
+* test shell script:
+ perftest_logd
+
+* input required:
+ - client and host certificates
+ - events
+
+* test description:
+ - measures the time required for an event to be sent from the client
+ to the locallogger and processed by it. The locallogger is
+ either instructed (by option) or instrumented to skip some
+ parts of event processing:
+ a) no parse, no file, no ipc
+ glite_lb_logd_perf_nofile --noParse --noIPC
+ b) no file, no ipc
+ glite_lb_logd_perf_nofile --noIPC
+ c) no ipc
+ glite_lb_logd_perf --noIPC
+ d) normal operation
+ glite_lb_logd_perf
+
+ no parse - LL does not parse events
+ no file - LL does not store events into files
+ no ipc - LL does not send events through socket to IL
+
+ - events produced:
+ - stresslog sends events to logd using client->logd
+ protocol (*_DoLogEvent())
+
+ - events consumed:
+ i) after storing into files
+ ii) by "empty" IL
+
+* results:
+
+
+
+i) events stored in files
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ a)
+ b)
+ c)
+ d)
+
+ii) events sent to IL
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ a)
+ b)
+ c)
+ d)
+
+
+
+----------------
+Interlogger test
+----------------
+
+* component:
+ org.glite.lb.logger
+
+* binaries required:
+ stresslog
+ glite_lb_interlogd_perf
+ glite_lb_interlogd_perf_noparse
+ - does not parse events, server address is hardcoded
+ glite_lb_interlogd_perf_nosync
+ - does not call event_store_sync()
+ glite_lb_interlogd_perf_norecover
+ - recovery thread disabled
+ glite_lb_interlogd_perf_nosend
+ - events are consumed instead of sending
+ glite_lb_interlogd_perf_lazy
+ - lazy closing connection to bkserver
+ glite_lb_bkserverd_perf_empty
+ - consumes event immediately after receiving
+
+* test shell script:
+ perftest_interlogd
+
+* input required:
+ - host certificate
+ - events
+
+* test description:
+ - measures the time an event takes to travel through the interlogger.
+ The interlogger is instrumented to skip some parts of event
+ processing for a particular test; specifically, the tests include
+ these variants:
+ a) disabled event parsing. The server address
+ (eg. jobid) is hardcoded.
+ b) disabled event synchronization from files
+ c) disabled recovery thread
+ d) lazy bkserver connection close
+ e) normal operation
+
+ - events produced:
+ 1) stresslog sends events to interlogger using the unix
+ domain socket and logd->interlogger protocol, events are
+ stored in files (stresslog behaves like logd)
+ TODO: there is no function for this in the producer library
+ 2) interlogger reads events from event files created by
+ stresslog (by recovery thread)
+ 3) stresslog stores events to files and every n-th event
+ (optional argument) is also sent through the unix socket
+
+ - events consumed:
+ i) discarded instead of being sent
+ ii) by "empty" bkserver
+
+* results:
+
+
+i) events discarded
+1) events received on socket
+(options 2 and 3 are not tested)
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ a)
+ b)
+ c)
+ e)
+
+
+ii) events sent to empty bkserver
+1) events received on socket
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ a)
+ b)
+ c)
+ d)
+ e)
+
+
+2) events recovered from files
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ d)
+ e)
+
+
+3) events synced from files, every 10th event sent on socket
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ a)
+ b)
+ c)
+ d)
+ e)
+
+
+------------
+LBProxy test
+------------
+
+* component:
+ org.glite.lb.proxy
+
+* binaries required:
+ stresslog
+ glite_lb_proxy_perf_noparse
+ - consumes events before parsing
+ glite_lb_proxy_perf_nostore
+ - consumes events before storing into database
+ glite_lb_proxy_perf_nostate
+ - consumes events before computing job status
+ glite_lb_proxy_perf_nosend
+ - consumes events before sending to interlogger
+ glite_lb_interlogd_perf_empty
+ - consumes immediately after reading event
+
+* test shell script:
+ perftest_proxy
+
+* input required:
+ - events
+
+* test description:
+ - measures the time required for processing an event by the LB proxy.
+ The test is performed with (a) and without (b) checking for
+ duplicate events.
+
+ - events produced:
+ - stresslog sends events using the IL protocol on a local
+ socket (using DoLogEventProxy())
+
+ - events consumed:
+ i) before parsing
+ ii) before storing into database
+ iii) after storing into database
+ iv) after job status computation
+ v) normal operation
+
+
+
+
+* results:
+
+a) with duplicate check:
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ i)
+ ii)
+ iii)
+ iv)
+ v)
+
+
+b) without duplicate check:
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ i)
+ ii)
+ iii)
+ iv)
+ v)
+
+
+--------------
+LB server test
+--------------
+
+* component:
+ org.glite.lb.server
+
+* binaries required:
+ stresslog
+ glite_lb_server_perf_noparse
+ - consumes events before parsing
+ glite_lb_server_perf_nostore
+ - consumes events before storing into database
+ glite_lb_server_perf_nostate
+ - consumes events before computing job status
+
+* test shell script:
+ perftest_server
+
+* input required:
+ - host certificates
+ - events
+
+* test description:
+ - measures the time required for processing an event by the LB server.
+ The test is performed with (a) and without (b) checking for
+ duplicate events.
+
+ - events produced:
+ - stresslog sends events using the IL protocol (using DoLogEventDirect())
+
+ - events consumed:
+ i) before parsing
+ ii) before storing into database
+ iii) after storing into database
+ iv) normal operation
+
+* results:
+
+a) with duplicate check:
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ i)
+ ii)
+ iii)
+ iv)
+
+
+b) without duplicate check:
+
+ throughput: small big small big
+ job job DAG DAG
+ -------------------------------------------------
+ i)
+ ii)
+ iii)
+ iv)
+
+
+
+---------------------
+Job registration test
+---------------------
+
+* component:
+ org.glite.lb.server
+ org.glite.lb.proxy
+
+* binaries required:
+ stressreg
+ - generates registration events
+ glite_lb_bkserverd
+ glite_lb_proxy
+ glite_lb_bkserverd_perf_empty
+ glite_lb_proxy_perf_empty
+
+* test shell script:
+ perftest_jobreg
+
+* input required:
+ - host & user certificates
+
+* test description:
+ - measures the time required to register a given number of jobs (the
+ time to process the registration events). Registration is
+ synchronous in principle, so the results can be obtained from
+ the client (stressreg) alone. Test variants include:
+ a) current implementation
+ b) implementation of connection pool at the client
+ c) parallel communication with server and proxy
+
+
+ - events produced:
+ - stressreg sends registration events by calling
+ edg_wll_RegisterJob*()
+
+ - events consumed:
+ i) normally processed by server & proxy
+ ii) server replies with immediate success
+ iii) proxy replies with immediate success
+
+* results:
+
+a) current implementation
+
+ throughput: one DAG DAG DAG
+ job (1000 nodes) (5000 nodes) (10000 nodes)
+ -----------------------------------------------------------------
+ i)
+ ii)
+ iii)
+
+
+b) connection pool
+
+ throughput: one DAG DAG DAG
+ job (1000 nodes) (5000 nodes) (10000 nodes)
+ -----------------------------------------------------------------
+ i)
+ ii)
+ iii)
+
+
+c) parallel communication
+
+ throughput: one DAG DAG DAG
+ job (1000 nodes) (5000 nodes) (10000 nodes)
+ -----------------------------------------------------------------
+ i)
+
+
+
+\end{verbatim}
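+
+The event throughput defined in the notes above can be restated as a worked
+formula. This is only a sketch: the symbols $t_{\mathrm{arrived}}$ and
+$t_{\mathrm{delivered}}$ stand for \code{time\_arrived} and
+\code{time\_delivered} from the notes, and the numeric value below is a
+made-up illustration, not a measurement.
+\[
+ \mathrm{throughput} = \frac{1}{t_{\mathrm{delivered}} - t_{\mathrm{arrived}}}
+\]
+For example, a hypothetical processing time of
+$t_{\mathrm{delivered}} - t_{\mathrm{arrived}} = 2\,\mathrm{ms}$ would give
+$1 / 0.002\,\mathrm{s} = 500$ events per second. As stated in the notes, the
+relation holds only if the next event is sent after the previous one was
+delivered.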
+
+\end{document}
\ No newline at end of file