VHIST Users' Guide
Stefan Vollmar, Andreas Hüsgen, Michael Sué,
Joachim Nock, Roman Krais
Email: vhist@nf.mpg.de
VHIST 1.60.0, Jan 26 2010, Rev 419:1758
Max-Planck-Institut für neurologische Forschung mit Klaus-Joachim-Zülch-Laboratorien der Max-Planck-Gesellschaft und der Medizinischen Fakultät der Universität zu Köln, Cologne, Germany, http://www.nf.mpg.de
Table of Contents
- 1 Introduction
- 2 vhistadd
- 2.1 Arg-Files
- 2.2 Synopsis
- 2.3 Built-in help
- 2.4 Get Version Information
- 2.5 Use an Arg-File
- 2.6 Sync Current Working Directory
- 2.7 Output VHIST file
- 2.8 Append to VHIST file
- 2.9 VHIST root file
- 2.10 Pre-defined attributes (workflow step)
- 2.11 User-defined attributes (workflow step)
- 2.12 PDF-related properties
- 2.13 Input file(s) with qualifiers
- 2.14 Output file(s) with qualifiers
- 2.15 Pre-defined flags (files)
- 2.16 Pre-defined attributes (files)
- 2.17 User-defined attributes (files)
- 2.18 Verbosity
- 2.19 Pretend Mode
- 2.20 Customization: PDF first page
- 2.21 Customization: embedded readme
- 2.22 vhistadd examples
- 3 vhistxs
- 4 vhistxl
- 5 vhistzard
1 Introduction
1.1 About VHIST
The VHIST [1], [2] project defines a file format specification that allows arbitrary binary data to be embedded for the documentation of workflows, together with structured meta-information and multiple facilities for validation. The format conforms to PDF and other open standards, is self-describing and is particularly suited as an image or meta-image format in the context of multi-modality and functional imaging.
It includes a platform-independent reference implementation which contains the essential features. VHIST can be used on top of existing workflows without the need to change major applications. Please refer to the VHIST white paper [1] for more information on the general concept and the specification; this Users' Guide focuses on how to use the reference implementation.
Please see Disclaimer and Licensing for legal issues.
1.2 VHIST Reference Implementation
The reference implementation currently consists of a set of commandline tools and a user-friendly suite of GUI tools. It can be downloaded at the VHIST homepage [1].
The commandline tools (VHIST core):
- vhistadd - commandline tool for creating VHIST files.
- vhistxs - commandline tool for extracting embedded data from VHIST files. Minimum implementation.
- vhistxl - commandline tool for extracting embedded data from VHIST files, including facilities for validating individual sections and for extracting individual embedded files with validation.
The GUI tools:
- vhistzard - an application that helps you configure commandline options for vhistadd and create VHIST files without using the commandline.
1.3 Hard- and Software Requirements
Platform independence was an important goal when conceiving VHIST and its Reference Implementation. We believe that VHIST Core should work well on a wide range of systems for which a Python distribution [3] (version 2.4 or later) exists, especially on anything manufactured in this millennium.
We use Trolltech's Qt libraries [4] for the user-friendly VHISTzard. Some distributions of VHIST ship with the appropriate libraries and should run out of the box.
- MS Windows
- A standard installer is available that can deploy a ready-to-run version of VHISTzard including all commandline tools. This setup does not require a Python distribution, as all commandline tools have been compiled to ready-to-use stand-alone ".exe" programs. Alternatively, you could install Python (using a proper "single-click" installer [3]) and use the "source" distribution of VHIST.
- Mac OS X
- A ready-to-run version of VHISTzard is available for this platform; it can be installed as a standard Mac application. Installing the source distribution is also possible: a suitable Python distribution has been part of the operating system since Mac OS X 10.3 (on older systems it can easily be installed without compromising other applications that might rely on the older versions). VHIST should work just fine on the newer "Intel" Macs (tested with Tiger and Leopard) and on the older PPC platform.
- Linux
- Python should be ubiquitous on this platform. Use python -V to check whether your installed version is recent enough (it very likely is); version 2.4 or later should work. If the installed version is not recent enough, you should consider upgrading it. If for some reason Python is not installed on your system, please refer to the documentation of your Linux distribution on how to install it.
- Solaris
- Python is shipped with Solaris 10 and available for previous versions of this operating system. VHIST should work on SPARC systems, as well as Solaris machines with x86 architecture.
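The version requirement can also be checked from within Python itself. The following is only a minimal sketch: the helper name meets_minimum is ours and not part of VHIST, and the check says nothing about whether a given VHIST release runs on every later Python version.

```python
import sys

# Minimum Python version named in this guide.
MINIMUM = (2, 4)

def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if (major, minor) of version_info is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)

print("Python %d.%d is recent enough: %s"
      % (sys.version_info[0], sys.version_info[1], meets_minimum()))
```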
1.4 VHIST terminology
The general idea behind VHIST is to provide a robust and simple means for documenting steps of a workflow by logging and optionally embedding all relevant information: which files were used, which files were written, what software package was used in which version and with what parameters.
An example from medical imaging is the process (workflow) to create an image volume suitable for scientific/diagnostic purposes. In this case, a typical workflow step could be the application of an image filter (tool; this example works best with a commandline tool) to an image, present as an input file (infile). The commandline used in this workflow step, in addition to some filter option (say, Gauss filter scope), could be added to the VHIST file using Attributes. Assuming the filter application writes a new file containing the filtered image volume, we can add this as an output file (outfile). It is recommended practice to have vhistadd embed (see Files: Embedded Vs. Reported) any logfile the filter application might write.
1.5 Attributes
An attribute is a key-value pair. vhistadd provides commandline options for specifying either pre-defined or user-defined keys. We make this distinction to ensure that a basic set of attributes is universally available (e.g. description or title). This is a prerequisite for comparing VHIST files from different sources. Attributes can be specified for the workflow step and even for individual in- and outfiles. The type and meaning of the value depend on the key and are described in the documentation of the individual options.
1.6 Appending to a VHIST file
VHIST files are stacks of sections which can be validated independently and usually refer to one workflow step. Sections are appended at the end of an existing file, so no previous data is changed. This is related to incremental writing of PDF [5] files. If you want to add on to a VHIST file containing information about previous workflow steps (this is recommended), you need to specify a VHIST root file (this can be any VHIST file).
1.7 Files: Embedded Vs. Reported
The vhistadd implementation uses the terms reported and embedded, meaning similar but slightly different things.
A reported file is a file whose file properties are reported in the automatically generated XML summary [1] (which contains structured information on one workflow step and is suitable for automated processing) and in the corresponding "human-readable" PDF part of the VHIST document.
An embedded file's properties are also listed in the XML summary and the PDF listing. However, in this case the contents of the file are also completely contained in the VHIST document, either in compressed or uncompressed form.
Embedded files can be extracted by various means, e.g. with a PDF browser, VHISTzard, vhistxs, vhistxl. You can embed binary data in VHIST files.
1.8 Disclaimer
THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT ANY EXPRESSED OR IMPLIED WARRANTY. SPECIFICALLY, NEITHER THE MAX-PLANCK-INSTITUT FÜR NEUROLOGISCHE FORSCHUNG MIT KLAUS-JOACHIM-ZÜLCH LABORATORIEN DER MAX-PLANCK-GESELLSCHAFT UND DER MEDIZINISCHEN FAKULTÄT ZU KÖLN NOR THE AUTHORS WARRANT THAT THE FUNCTIONS CONTAINED IN THE SOFTWARE WILL MEET YOUR REQUIREMENTS, OR THAT THE OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT DEFECTS IN THE SOFTWARE WILL BE CORRECTED. TO THE EXTENT PERMITTED BY LAW, NEITHER THE MPI FÜR NEUROLOGISCHE FORSCHUNG NOR THE AUTHORS SHALL BE LIABLE FOR ANY DAMAGES ARISING OUT OF OR RELATING TO THE USE OF THE SOFTWARE, INCLUDING BUT NOT LIMITED TO INCIDENTAL, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY LOST PROFITS, BUSINESS INTERRUPTION, LOSS OF PROGRAMS OR OTHER DATA ON YOUR INFORMATION HANDLING SYSTEM.
1.9 Licensing
FOR NON-COMMERCIAL SCIENTIFIC RESEARCH USAGE ONLY, the VHIST specification [1] and the VHIST Reference Implementation are available under the GNU General Public License (GPL Version 3) [6]. Any other usage requires written permission by the Max Planck Institute for Neurological Research Cologne.
2 vhistadd
vhistadd is the command-line tool used for creating or appending to VHIST files.
We think that vhistadd is comparatively easy to use, although the number of commandline options it features can be a bit intimidating. Therefore we developed VHISTzard, a user-friendly tool specifically designed to assist you with assembling and running vhistadd commandlines, which can be used without detailed knowledge of vhistadd's options.
To give you a glimpse of what vhistadd can do for you, we have added some examples.
2.1 Arg-Files
You can write the arguments of a vhistadd call into a file and then specify this file instead of typing the whole commandline into the shell. These files are called Arg-Files.
Originally, this feature was implemented due to limitations of the MS Windows platform, where only commandline arguments of fairly moderate size can be passed to programs (2048 characters on Windows 2000, 8191 on Windows XP [7]; typical UNIX commandlines can be >100KB long). The corresponding error message does not make this obvious: "Windows cannot access the specified device, path, or file. You may not have the appropriate permissions to access the item."
Since 2048 characters are enough only for the simplest use cases, a more powerful mechanism was needed. Arg-Files have several additional advantages over normal commandline calls: they can easily be archived, copied around and passed to other users. They can even be embedded into the VHIST file they generate. The lines in an Arg-File can be of arbitrary length, and the syntax for Arg-Files is independent of platform, operating system or shell. It is very similar to the syntax of the bash [8] shell: single arguments are separated by one or several whitespace characters (space, tab or newline), strings which contain whitespace must be enclosed in quotes ("), and environment variables start with a $ sign, e.g. $PATH. Comments start with a hash symbol # and continue until the end of the line.
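The syntax rules above can be approximated with Python's standard shlex module. This is a sketch for illustration only and not the parser used by vhistadd: the function name read_arg_text and the variable VHIST_DEMO_DIR are made up, and shlex additionally honours single quotes and backslash escapes, which the description above does not mention.

```python
import os
import shlex

def read_arg_text(text):
    """Split Arg-File text into a list of arguments (approximation)."""
    # comments=True drops everything from '#' to the end of the line;
    # double-quoted strings may contain whitespace.
    tokens = shlex.split(text, comments=True)
    # Expand environment variables such as $PATH in every argument.
    return [os.path.expandvars(token) for token in tokens]

example = '''
# the workflow step's title
-s title "Hello VHIST"
-O "$VHIST_DEMO_DIR/hello.vhist"   # output VHIST file
'''

# Hypothetical variable, set here so that the sketch is self-contained.
os.environ["VHIST_DEMO_DIR"] = "/tmp/demo"
print(read_arg_text(example))
```

Because newlines count as ordinary whitespace, the arguments of one call can be spread over as many lines as is convenient for commenting.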
2.2 Synopsis
This table contains a brief summary of the commandline options for vhistadd.py.
Function | Commandline Option |
---|---|
Built-in help | -h or --help |
Get Version | -V or --version |
Use an Arg File | -c or --cmdfile <cmdfile> |
Sync current directory | --sync-cwd |
Output VHIST file | -O <filename1> [-O <filename2> ...] [other options] |
Append to VHIST file | -A <filename> |
VHIST root file | -I <filename> or -J <filename> |
Pre-defined attributes (workflow step) | -s <key> <value> [-s <key> <value>] ... |
User-defined attributes (workflow step) | -U <key> <value> [-U <key> <value>] ... |
PDF-related properties | -d <key> <value> [-d <key> <value>] ... |
Input file(s) with qualifiers | -i <filename> ... <qualifiers> [-i <filename> ... <qualifiers>] |
Output file(s) with qualifiers | -o <filename> ... <qualifiers> [-o <filename> ... <qualifiers>] |
Customization: PDF first page | -1 or --firstpage |
Customization: embedded readme | -r or --readme |
Verbosity | -v or -q |
Pretend Mode | -p |
This table contains the qualifiers that can be attached to input or output files.
Type of Qualifier | Commandline Option |
---|---|
Pre-defined flags (files) | [-f <flag>] [-f <flag>] ... |
Pre-defined attributes (files) | -a <key> <value> [-a <key> <value>] ... |
User-defined attributes (files) | -u <key> <value> [-u <key> <value>] ... |
2.5 Use an Arg-File
-c <cmdfile> or --cmdfile <cmdfile> will read the Arg-File <cmdfile>.
An Arg-File is a file containing the commandline of one vhistadd call as described in Arg-Files.
2.6 Sync Current Working Directory
With --sync-cwd it is possible to synchronize the current working directory used by vhistadd with the path of the selected Arg-File. If synchronizing is enabled, all paths (with the exception of readme and firstpage files) specified in the commandline or in any Arg-File are relative to the path of the first Arg-File.
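The effect of synchronizing can be pictured as resolving every relative path against the directory that contains the first Arg-File. The sketch below is our illustration, not code taken from vhistadd; the function name and file names are hypothetical, POSIX path rules are used to keep the example deterministic, and leaving absolute paths untouched is an assumption.

```python
from pathlib import PurePosixPath

def resolve_against_arg_file(arg_file, path):
    """Interpret `path` relative to the directory that contains `arg_file`."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return candidate          # assumption: absolute paths stay as they are
    return PurePosixPath(arg_file).parent / candidate

print(resolve_against_arg_file("/data/study42/step1.args", "images/pat-pet.v"))
```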
2.7 Output VHIST file
-O <filename> [-O <filename>] [-O <filename>] ...
The name of the VHIST file which will be generated during this workflow step. If more than one file is specified, several VHIST files with identical content will be generated. At least one -O (capital letter "O") option must be specified, unless the Append to VHIST file option is used instead.
By design, vhistadd prevents you from overwriting existing VHIST files except
(a) when an output VHIST file is also specified as a VHIST root file or
(b) when you choose the Append to VHIST file option.
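The overwrite rule with its two exceptions can be written down as a small decision function. This is a sketch of the rule as described here, not of vhistadd's implementation: the function name may_write is ours, and file names are compared as plain strings, whereas a real tool would probably compare normalized paths.

```python
def may_write(output, exists, root=None, append=False):
    """Decide whether an output VHIST file may be written.

    output -- name of the output VHIST file
    exists -- True if a file of that name is already present
    root   -- name of the VHIST root file (-I or -J), if any
    append -- True if the file was named with -A instead of -O
    """
    if not exists:
        return True                                # nothing to overwrite
    if append:
        return True                                # exception (b)
    return root is not None and root == output     # exception (a)

print(may_write("hello.vhist", exists=True))       # refused: file exists
```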
2.8 Append to VHIST file
-A <filename>
If the file exists, the current workflow step will be appended to the file, preserving all previous content. If not, a new VHIST file of that name will be created.
2.9 VHIST root file
-I <filename> (file must exist)
-J <filename> (ignore if file cannot be read)
The name of the VHIST file which is used as the basis for the new VHIST file. The generated workflow step is appended to a copy of this VHIST document. The VHIST root file is not modified unless it is also specified as an output VHIST file. If this option is not specified, vhistadd starts with an empty document. The -I option will cause vhistadd to stop with an error message if the VHIST root file cannot be read; use -J if vhistadd should ignore a missing root file (we have introduced -A, see previous subsection, to achieve the same effect without having to specify an output VHIST file with -O).
2.10 Pre-defined attributes (workflow step)
-s <key> <value>
We distinguish between User-defined attributes (workflow step) and pre-defined properties or attributes of a workflow step. Use -s to set one or more of the following pre-defined attributes (it is an error to use other keywords with this option):
Attribute | Brief Description |
---|---|
title | The title of the workflow step. |
description | A short description of the workflow step. Longer descriptions can be embedded as input files. |
comment | Additional comments referring to this workflow step. |
tool | A version string of the tool used in the workflow step. |
toolversion | An alias for tool. |
toolpath | The path to the executable of the tool. |
host | The name of the host on which the workflow step was performed. |
user | The name of the user who performed the workflow step. |
command | The commandline used to execute the workflow step. |
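When vhistadd is driven from a script, it can be convenient to check attribute keys before the call is made. The following sketch builds a list of -s arguments from a dictionary and rejects keys that are not in the table above. The function names are ours; whether vhistadd fills in host or user on its own is not covered here, so local_defaults merely shows one way to obtain these values.

```python
import getpass
import socket

# Keys accepted by -s, taken from the table above.
PREDEFINED_STEP_KEYS = ("title", "description", "comment", "tool",
                        "toolversion", "toolpath", "host", "user", "command")

def step_options(attributes):
    """Turn a dict of pre-defined attributes into a list of -s arguments."""
    args = []
    for key in sorted(attributes):
        if key not in PREDEFINED_STEP_KEYS:
            raise ValueError("not a pre-defined attribute: %r" % key)
        args += ["-s", key, str(attributes[key])]
    return args

def local_defaults():
    """Collect host and user name from the running system."""
    return {"host": socket.gethostname(), "user": getpass.getuser()}

print(step_options({"title": "Co-Registration Task",
                    "tool": "my-coreg-tool v1.23"}))
```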
2.11 User-defined attributes (workflow step)
-U <key> <value> sets a user-defined attribute.
We distinguish between user-defined and Pre-defined attributes (workflow step).
-U "my key" "my value" would create the attribute my key and set it to my value.
2.12 PDF-related properties
-d <key> <value> or -doc <key> <value>
Valid keys are: producer, creator, author, keywords, title, subject.
These properties are important when viewing VHIST files using a PDF browser. We refer to [5] for a detailed description.
These properties can currently only be set for newly created VHIST files. When appending a workflow step to an existing VHIST file, an attempt to set PDF-related properties is ignored.
2.13 Input file(s) with qualifiers
-i <filename> or --infile <filename> specifies one or more input files.
The VHIST format distinguishes between the files present before the workflow step (infiles) and files which result from executing the specified tool (outfiles).
See Pre-defined flags (files), Pre-defined attributes (files), User-defined attributes (files) for further options.
2.14 Output file(s) with qualifiers
-o <filename> or --outfile <filename> specifies one or more output files.
The VHIST format distinguishes between files present before the workflow step (infiles) and files which result from executing the specified tool (outfiles).
See Pre-defined flags (files), Pre-defined attributes (files), User-defined attributes (files) for further options.
2.15 Pre-defined flags (files)
-f <flag> sets one or more pre-defined flags. It is an error to use keywords other than pre-defined flags with this option.
A flag enables or disables a property of the previously specified in- or outfile. A flag can be negated by prefixing its name with no-, e.g. no-embed.
Flag | Brief description |
---|---|
automd5 | The MD5 sum [9] of the file's content is automatically calculated by vhistadd. If the file is embedded, this option is forced. By default, this option is enabled. |
embed | The file's content is embedded into the generated VHIST file. By default, this option is enabled. |
compress | The file is compressed using the flate compression method [10]. If the file's content is not embedded, this flag is ignored. By default, this option is enabled. |
optional | The file is optional and only included in the VHIST file if it is readable, otherwise it is silently ignored. By default, this flag is not set, i.e. if vhistadd needs to access the file to compute the MD5 sum or read the last modification date but the file is not available, vhistadd will abort with an error message. If this flag is not set, but either of the flags embed or automd5 is set and the file is not readable, vhistadd will also abort with an error message. |
previewws | Hint that a VHIST browser with a suitable template can use this file as a preview image for the whole workflow step. Currently must be a JPEG or PNG image. |
preview | Hint that a VHIST browser with a suitable template can use this file as a preview image for the next embedded file. Currently must be a JPEG or PNG image. Tip: define a "description" attribute to ensure correct identification of the associated data file. |
thumbnail | Hint that a VHIST browser with a suitable template can use this file as a small preview image (usually called "thumbnail", e.g. a typical file icon for image files). Currently must be a JPEG or PNG image. |
thumbnailonly | Similar to "thumbnail" but can be used to hint that this thumbnail image should be used in the context of the immediately following file and that no detailed information about this (illustrative) image is required (or, indeed, desired) as all relevant data will be supplied by the next image's attributes. CAVEAT: If the associated data file is of the type outfile, the thumbnail should also be of this type; the same applies to the type infile. |
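How the no- prefix and the defaults interact can be illustrated with a short sketch. It models only what the table states: automd5, embed and compress are enabled by default, and automd5 is forced for embedded files. Treating the remaining flags as disabled by default is our assumption, and apply_flags is a hypothetical helper, not part of vhistadd.

```python
# Flags from the table above.
KNOWN_FLAGS = ("automd5", "embed", "compress", "optional",
               "previewws", "preview", "thumbnail", "thumbnailonly")
# Defaults stated in the table; all other flags are assumed to be off.
DEFAULTS = {"automd5": True, "embed": True, "compress": True}

def apply_flags(flags):
    """Return the resulting on/off state for a list of -f values."""
    state = dict((name, DEFAULTS.get(name, False)) for name in KNOWN_FLAGS)
    for flag in flags:
        enabled = not flag.startswith("no-")
        name = flag if enabled else flag[len("no-"):]
        if name not in KNOWN_FLAGS:
            raise ValueError("not a pre-defined flag: %r" % flag)
        state[name] = enabled
    # The table says automd5 is forced for embedded files.
    if state["embed"]:
        state["automd5"] = True
    return state

print(apply_flags(["no-embed", "optional"]))
```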
2.16 Pre-defined attributes (files)
-a <key> <value> sets one or more pre-defined attributes. It is an error to use keywords other than pre-defined attributes with this option.
An attribute describes a property of the previously specified in- or outfile.
Attribute | Brief Description |
---|---|
filetype | A user-defined text describing the file type. We suggest a convention reminiscent of the MIME [11] standard, e.g. binary/Analyze-Header. |
description | A short description of the in- or outfile. There is no formal size limit; however, we suggest not using this option with text significantly longer than a few lines. If required, put the description in a file which can then be embedded. |
comment | Additional comments referring to the in- or outfile. We suggest using this entry for observations made during this particular workflow step, e.g. unusually poor quality of data due to some hardware malfunction. |
md5file | The name of a file which contains an MD5 checksum [9] in the first line. The sum is used as the user-specified MD5 sum for the in- or outfile. This attribute is useful in situations in which a file is not embedded and the checksum was already generated by another application, e.g. for very large files where calculating the MD5 sum is only feasible with low-priority background processes. |
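A checksum file suitable for the md5file attribute could be prepared in advance, for instance by a background job. The sketch below uses Python's hashlib and reads the data in chunks so that very large files do not have to fit into memory. The function names are ours, and we write nothing but the checksum into the first line; whether vhistadd tolerates additional text on that line is not stated in this guide.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file without loading it at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_md5_file(data_path, md5_path):
    """Write the checksum of `data_path` into the first line of `md5_path`."""
    checksum = md5_of_file(data_path)
    with open(md5_path, "w") as handle:
        handle.write(checksum + "\n")
    return checksum
```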
2.17 User-defined attributes (files)
-u <key> <value> sets a key-value pair of your choice.
We distinguish between user-defined attributes and Pre-defined attributes (files).
-u "my key" "my value" sets the attribute my key to the value my value.
2.18 Verbosity
-v or --verbose
Use this option to have vhistadd generate more verbose output to stdout.
-q or --quiet
If this option is set, the output to stdout is reduced to errors only.
2.19 Pretend Mode
-p or --pretend
If this option is set, vhistadd only parses the commandline and generates the workflow step but does not generate a VHIST file. This option is useful for verifying a commandline.
2.20 Customization: PDF first page
-1 or --firstpage
A file containing the content of the first page of a newly created VHIST document. The content of the file must be encoded in UTF-8 [12]. The text can contain Wiki-like markup which supports bold, italic and one coloured (blue) font attributes. By default, vhistadd will use res/title.txt.
This option is ignored if the workflow step is appended to an existing VHIST file.
2.21 Customization: embedded readme
-r or --readme
A file containing the readme, which is embedded at the beginning of a newly created VHIST document. The content of the file must be encoded in an ASCII-compatible encoding; UTF-8 [12] is preferred. By default, vhistadd will use res/embedded_readme.txt.
This option is ignored if the workflow step is appended to an existing VHIST file.
2.22 vhistadd examples
Please note that the following examples assume that you are using the "plain vanilla" Python version of vhistadd. The important part of the examples are the commandline parameters (everything after the initial vhistadd.py). We suggest you use bin/setup.py for setting up paths and links on unixoid platforms (Linux, Mac OS X).
The actual calling syntax of the vhistadd tool depends on your platform and configuration, e.g.
UNIX/Linux/Mac OS X, assuming vhistadd was installed in /usr/local/vhist:
/usr/local/vhist/bin/vhistadd.py ...
UNIX/Linux/Mac OS X, with a suitable PATH variable or alias:
vhistadd.py ...
UNIX/Linux/Mac OS X, if the Python sources are (for some reason) not directly executable:
python vhistadd.py ...
MS Windows without Python, assuming the VHIST executables were installed in Programs (no Python distribution is then necessary):
C:\Program Files\VHIST\bin\vhistadd.exe
MS Windows with Python, assuming the VHIST executables were installed in Programs (you need to install a Python distribution first):
cmd vhistadd.py
2.22.1 Example 1
The following example will generate a new VHIST file hello.vhist which "documents" a simple application of the echo command. CAVEAT: vhistadd will fail if hello.vhist already exists (this is by design to prevent you from unintentionally overwriting important data). You also need to take into account the general problems of passing commandline arguments with special characters (spaces, inverted commas) by escaping them (e.g. use "\!" if your text contains an exclamation mark) and/or using a suitable type of inverted commas.
vhistadd.py -s title "Hello VHIST" -s command "echo 'Hello VHIST'" -O hello.vhist
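One way to sidestep shell quoting problems is to keep the arguments in a list and let a library do the quoting. The sketch below uses shlex.join from the Python standard library (available since Python 3.8) to print a commandline that a POSIX shell will split back into exactly the same arguments. It only builds and prints the text; nothing is executed, and the exclamation mark in the title is added here to show that special characters survive.

```python
import shlex

args = ["vhistadd.py",
        "-s", "title", "Hello VHIST!",
        "-s", "command", "echo 'Hello VHIST'",
        "-O", "hello.vhist"]

# shlex.join quotes each argument for a POSIX shell.
command_line = shlex.join(args)
print(command_line)
```

If the call is started from Python anyway, passing the list itself to subprocess.run avoids the shell, and with it the quoting problem, altogether.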
2.22.2 Example 2
This example demonstrates how to have vhistadd add information about a workflow step involving the co-registration of two image volumes (which is an important task in multi-modality imaging of human brains). We think the general idea of documenting this non-trivial task involving multiple files can easily be transferred to other experimental setups.
Assume we have a program my-coreg-tool that is capable of performing fully automatic co-registration of image volumes, and two image volumes of the same patient from two different modalities (here: MRI and PET scans of the same patient's brain). The tool will write a coreg.log protocol which we want to embed in the VHIST file (usually logs are so small that this is the recommended practice) in addition to information about the co-registration result (another image volume: regimage-pet.v).
We have used the ECAT7 file type in these examples, which encodes one image volume with meta information in a single file; binary indicates that this file type does not store information in any "human-readable" form.
Commandline | Description |
---|---|
vhistadd.py | |
-s title "Co-Registration Task" | the workflow step's title |
-s tool "my-coreg-tool v1.23" | the program's name and version |
-s toolpath "/usr/bin/my-coreg-tool" | the program's full path |
-i coreg.log -f embed | input file: co-registration log, will be embedded |
-a filetype "text/log" | setting optional file type info on previous file |
-i "pat-pet.v" -f no-embed | input file: PET image volume of patient, do not embed |
-a filetype "binary/ECAT7" | setting optional file type info on previous file |
-i "pat-mri.v" -f no-embed | input file: MRI image volume of patient, do not embed |
-a filetype "binary/ECAT7" | setting optional file type info on previous file |
-o "regimage-pet.v" -f no-embed | output file: co-reg. PET image volume of patient |
-a filetype "binary/ECAT7" | setting optional file type info on previous file |
-O "regimage-pet.v.vhist" | VHIST output file (mandatory) |
Please note that we have tried to keep this example brief. In terms of "Good Scientific Practice" you can easily improve on this template by adding further attributes to the workflow step or individual files: Pre-defined attributes (workflow step), User-defined attributes (workflow step), User-defined attributes (files).
2.22.3 Example 3
Based on the output of the previous example we now want to apply a 3D image filter to the registered image volume. Documentation of this next workflow step should take advantage of the existing processing history, i.e. use the VHIST file generated in the previous step.
This is achieved by specifying the previous VHIST file as the VHIST root file: the VHIST file generated in this workflow step will then contain a copy of the full previous history, and the new meta information on the current process (here: 3D filtering) is appended as a new section at the end.
# the workflow step's title
-s title "Gauss 3D Filter"
# an optional description
-s description "apply filter to PET image"
# the program's name and version
-s tool "my-filter-tool v2.34"
# the program's full path
-s toolpath "C:\Programs and Files\my-filter-tool.exe"
# optional attribute (key-value-pair)
-U "gauss-fwhm-mm" "2.3"
# input file: image volume, do not embed
-i "regimage-pet.v" -f no-embed
# output file: filter.log, will be embedded
-o "filter.log" -f embed
# the VHIST root file
-I "regimage-pet.v.vhist"
# VHIST file to generate
-O regimage-filtered.v.vhist
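A commented listing like the one above can also be generated by a script and saved as an Arg-File. The following sketch renders pairs of comment and arguments into Arg-File text; the helper names are hypothetical, every value is enclosed in double quotes (harmless under the bash-like syntax described in Arg-Files), and values containing a double quote are rejected because this guide does not describe how to escape them.

```python
def quote(value):
    """Enclose a value in double quotes, as described for Arg-Files."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled here")
    return '"%s"' % value

def render_arg_file(entries):
    """Render (comment, arguments) pairs as Arg-File text."""
    lines = []
    for comment, arguments in entries:
        lines.append("# " + comment)
        # Option names (starting with '-') stay bare, values are quoted.
        lines.append(" ".join(a if a.startswith("-") else quote(a)
                              for a in arguments))
    return "\n".join(lines) + "\n"

step = [
    ("the workflow step's title", ["-s", "title", "Gauss 3D Filter"]),
    ("optional attribute (key-value pair)", ["-U", "gauss-fwhm-mm", "2.3"]),
    ("output file: filter.log, will be embedded", ["-o", "filter.log", "-f", "embed"]),
    ("the VHIST root file", ["-I", "regimage-pet.v.vhist"]),
    ("VHIST file to generate", ["-O", "regimage-filtered.v.vhist"]),
]
arg_text = render_arg_file(step)
print(arg_text)
```

Written to a file, the text can then be passed to vhistadd with -c <cmdfile>.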
3 vhistxs
The vhistxs commandline tool is part of the VHIST Core reference implementation and demonstrates the simplest approach to extracting data from a VHIST file. This short program is embedded in each VHIST file: it is part of the default "readme" which is located at the very beginning of each VHIST file.
vhistxs.py <vhistfile>
Figure 6: Synopsis of commandline tool vhistxs.
4 vhistxl
A significantly more powerful tool, vhistxl also allows for validation of extracted data and sections of a VHIST file. In particular, the List Embedded Files option can be useful: consider it similar to the same functionality of the UNIX tar command.
Function | Commandline Option |
---|---|
Built-in Help (vhistxl) | vhistxl.py -h or --help |
Get Version Information | vhistxl.py -V or --version |
List Embedded Files | vhistxl.py -t or --list <vhistfile> |
Validate Sections | |
Extract all files | vhistxl.py -x or --extract |
Extract one file | vhistxl.py -r <fileid> or --extract-file-<fileid> |
Validate files | vhistxl.py -p or --pretend |
Specify extraction directory | |
4.3 List embedded files
-t or --list <vhistfile> will only list all embedded files and not extract any data.
4.5 Extract all files
-x or --extract will extract all files to disk (implies List Embedded Files).
4.6 Extract one file
-r <fileid> or --extract-file-<fileid> will extract the file with the id <fileid>.
Use List Embedded Files to find the correct <fileid> of a particular file.
4.7 Validate files
-p or --pretend will only decompress and test MD5 checksums [9] of embedded files, but not write anything to disk.
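What "decompress and test" amounts to can be sketched with the zlib and hashlib modules of the Python standard library. The example works on a byte string in memory and does not model how vhistxl locates embedded streams inside a VHIST file; validate_embedded is a hypothetical helper.

```python
import hashlib
import zlib

def validate_embedded(compressed, expected_md5):
    """Decompress flate data in memory and compare its MD5 sum."""
    try:
        data = zlib.decompress(compressed)
    except zlib.error:
        return False              # the compressed stream is damaged
    return hashlib.md5(data).hexdigest() == expected_md5.lower()

payload = b"image volume bytes"
stream = zlib.compress(payload)
print(validate_embedded(stream, hashlib.md5(payload).hexdigest()))
```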
5 vhistzard
vhistzard is a collection of graphical user interface tools designed to make casual work with VHIST easier. vhistzard has been written in C++ using the Qt libraries, version 4, from Trolltech [4] and runs natively on Windows, Mac OS X and Linux.
It provides facilities to browse through existing VHIST files, modify existing ones and create new ones.
Figure 8: vhistzard in action, here shown in the MS Windows incarnation.
5.1 Creating VHIST Arg-Files
5.1.1 Introduction
One component of vhistzard has been designed for easy and comfortable creation of commandline calls for use with vhistadd. It is a complementary effort to improve acceptance at sites that have little scripting experience, but we feel it might be useful to experienced users as well.
The window of this module is divided into four tabs: Main, In/Out Files, VHIST files and Export. Each tab is presented in one of the following subsections.
5.1.2 Main
First, we'll enter general information including a title and description of the workflow step that we either want to append to an existing VHIST file or use to start a new VHIST file.
Although the description is not technically limited in length, we recommend keeping it short and embedding any longer texts as simple text files.
Figure 9: Start with entering general information on a particular workflow step. The screenshot is from the MacOS X version.
5.1.3 In/Out Files
Use this tab to specify which files have been used in a workflow step. We distinguish between files that were used as input in this step and files that were created, i.e. files of type output. All files you have specified will be listed in the box at the bottom.
If you have a number of similar files, we suggest you define one in detail, use the Duplicate File button to copy the settings and then focus on the differences.
Figure 10: Specify which files have been used in a particular workflow step - and what their functions were (input files vs output files)
5.1.4 VHIST files
There are two types of files you can specify using this tab.
- A VHIST root file is optional (upper part of the dialog) and refers to an existing VHIST file (usually from a previous workflow step). If specified, its contents will be copied to the new VHIST file, any new information from the current workflow step is appended so that neither the copied data nor the root file are modified.
- You need to specify at least one VHIST file that will be created during this workflow step (bottom part of the dialog). For your convenience, we have added some settings that will derive the filename(s) from files you might have specified in the In/Out Files tab.
Figure 11: Choose how many VHIST files should be generated. You can select one of the presets for frequently used configurations.
5.1.5 Export
All your settings from the previous tabs will be converted to vhistadd commandline options. You can copy them to the clipboard and paste them into a commandline or other tool of your choice. Please also consider using the Arg-File option and having vhistzard write a file with automatic comments, see Use an Arg-File.
In this tab you can also configure whether paths are written as relative paths. The other option, which is the default, is to leave them as absolute paths.
Figure 12: You can export all configured settings by clipboard to a commandline or other tool of your choice.
5.2 Running VHIST Arg-Files
vhistzard also contains a tool which is designed to assist in the creation of VHIST files from existing VHIST Arg-Files. It allows you to execute vhistadd without using a commandline.
Figure 13: A newly opened "Run VHIST Arg-File" window (here shown for a Linux flavour of vhistzard). The message in the output box points users to the website, the documentation and the examples.
5.2.1 Selecting a VHIST Arg-File
To select an Arg-File, just enter the name of the file in the field labelled "Arg-File". You can also select a file by clicking on the browse button next to the text field or by dragging a file from your file manager (e.g. Explorer on Windows, the Finder on Mac OS X) onto the dialog. You can set a custom current working directory by unchecking the checkbox below the text fields and entering a path into the text field labelled "current working directory". This, however, is seldom needed. See also What is a Current Working Directory and Why You Should (Not) Bother About It.
5.2.2 Viewing the File's Content
To view the contents of the currently selected file, click on the tab labelled "File Content". This view can be used to verify that the correct file was selected or to inspect someone else's file. To prevent an accidentally selected file from filling up the computer's memory, only files smaller than a certain size are displayed. This is not a problem in practice since Arg-Files are usually no larger than several kilobytes.
Figure 14: The contents of the selected Arg-File are displayed in the text browser at the bottom of the window.
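The size check described above could look roughly like the following Python sketch. The limit of 64 KB and the function name read_for_preview are assumptions; the actual threshold used by vhistzard is not stated in this guide.

```python
import os
import tempfile

MAX_PREVIEW_BYTES = 64 * 1024  # assumed limit, not the value used by vhistzard


def read_for_preview(path, limit=MAX_PREVIEW_BYTES):
    """Return the file's text if it is smaller than limit, otherwise None."""
    if os.path.getsize(path) >= limit:
        return None
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()


# A small temporary file stands in for an Arg-File.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("# hypothetical Arg-File content\n")
    demo_path = tmp.name

small = read_for_preview(demo_path)             # content is returned
too_big = read_for_preview(demo_path, limit=4)  # None: file exceeds the limit
os.remove(demo_path)
```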
5.2.3 Executing a VHIST Arg-File
"Executing an Arg-File" here refers to running vhistadd using the options defined in the Arg-File (similar in concept to running a powerful commandline tool with a number of commandline options/arguments). A click on the "Run" button will start vhistadd using the selected Arg-File; the output is displayed in the "Commandline Output" tab below the Run button.
Figure 15: The correct run of vhistadd is indicated by an error code of 0.
If an error or warning occurs during the execution, it is highlighted in red or orange.
Figure 16: Errors that occurred during the execution of vhistadd are highlighted in red.
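If you prefer to script such a run yourself, the same check of the error code can be sketched in Python. The helper run_tool is hypothetical, and the commands below use the Python interpreter as a stand-in; the actual vhistadd call and its options (see Use an Arg-File) have to be supplied by you.

```python
import subprocess
import sys


def run_tool(command, cwd=None):
    """Run a command, capture its output and report whether it succeeded."""
    result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    succeeded = result.returncode == 0  # an error code of 0 signals a correct run
    return succeeded, result.returncode, result.stdout, result.stderr


# Stand-in commands; replace them with your actual vhistadd call.
ok, code, out, err = run_tool([sys.executable, "-c", "print('done')"])
failed, fail_code, _, _ = run_tool([sys.executable, "-c", "import sys; sys.exit(3)"])

print(ok, code)           # True 0
print(failed, fail_code)  # False 3
```

The cwd parameter corresponds to the current working directory that can be set in the run dialog.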
5.2.4 What is a Current Working Directory and Why You Should (Not) Bother About It
A call of vhistadd usually contains references to several files using file paths. These paths can be either "absolute" or "relative". An absolute path specifies the exact position of the file for a particular file system, e.g. C:\SomeDirectory for file C:\SomeDirectory\MyFile (MS Windows syntax). On UNIX/Linux systems a similar example is the absolute path /usr/local/data/mydir for file /usr/local/data/mydir/MyFile. This way of referencing is unambiguous but can be quite inflexible if you reorganize data.
Using relative paths is an alternative: a relative path is specified in relation to a reference directory, also known as the current working directory (CWD). It is therefore important to set the CWD correctly; otherwise files referenced with relative paths are not found (or, worse, the wrong files might be used). Indeed, it is good practice to specify all relative paths inside an Arg file relative to the Arg file itself. This means that you usually do not have to care about the CWD and can just set it to the directory in which your Arg file is located. This is also the default option in the run dialog of vhistzard.
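The effect of the CWD on relative paths can be demonstrated with a few lines of Python. This is only an illustration: the Arg file name step1.arg and the helper resolve are hypothetical, and UNIX-style paths are used throughout.

```python
from pathlib import PurePosixPath


def resolve(path, cwd):
    """Resolve a path against cwd; absolute paths are returned unchanged."""
    p = PurePosixPath(path)
    return p if p.is_absolute() else PurePosixPath(cwd) / p


# Hypothetical location: the Arg file lives in /usr/local/data/mydir.
arg_file = PurePosixPath("/usr/local/data/mydir/step1.arg")
cwd = arg_file.parent  # recommended CWD: the directory of the Arg file

print(resolve("MyFile", cwd))             # /usr/local/data/mydir/MyFile
print(resolve("/tmp/other/MyFile", cwd))  # /tmp/other/MyFile
print(resolve("MyFile", "/home/user"))    # /home/user/MyFile (a different file)
```

The last line shows why a wrongly set CWD is dangerous: the same relative path silently refers to another file.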
5.2.5 Testing the Examples with vhistzard
The examples described in vhistadd - Examples are also part of all VHIST distributions and are located in the examples directory. Testing them on your system is quite easy.
Figure 17: To install the examples, select the "Install Examples" option from the "Extras" menu.
- Select "Install Examples" from the "Extras" menu.
- In the following dialog, select a directory into which the examples should be installed. A directory "Examples" will be created inside the selected directory.
- If the directory already exists, vhistzard will ask if it should be overwritten.
- In the next dialog, you can select one of the example files.
- To run the example, just press the "Run" button.
- The generated VHIST file is stored in the directory in which the Arg file is located.
5.3 Viewing VHIST files
One part of vhistzard is a tool to view VHIST files and to extract embedded files from them in a user-friendly manner. This tool provides a template-based preview ("themes") of the information contained inside the VHIST file (in HTML format) and allows for easy extraction of single files or complete workflow steps.
Figure 18: The outline on the left-hand side is used for navigation inside the VHIST file as well as extraction of embedded files. The textbrowser next to it shows detailed information about the individual sections and files inside the VHIST file.
5.3.1 Opening a VHIST File
Opening a VHIST file is quite simple. Select "Open" from the "File" menu and navigate to the file, or just drag the file from your file manager (e.g. Explorer on Windows, the Finder on Mac OS X) onto the dialog. vhistzard will read the file and present an outline of its content as well as augmented information about the files and sections.
5.3.2 Navigating Inside a VHIST File
To navigate inside a VHIST file, you can either use the outline on the left-hand side of the dialog or scroll through the textual information on the right-hand side. To jump to a specific section or file, double-click on its entry in the outline.
5.3.3 Extracting Embedded Files
To extract one or several files from a VHIST file, select the files in the outline and click on the "extract" button. For each selected file, a "Save File" dialog will appear. Embedded files are marked by a paper clip icon next to the filename in the outline.
5.3.4 Selecting a Template/Theme
You can use vhistzard to view the summary of your VHIST file by means of various templates ("themes"). Different templates show information at different levels of granularity and present it in different ways. To select a template, click on the drop-down list above the text viewer.
5.3.5 Technical information on Templates/Themes
A template is an XHTML2 [13] file which can be viewed by any HTML browser. It usually contains keywords (e.g. $COMMENT) which will be replaced by the corresponding content from the current VHIST file; see below for a list of keywords which are available when the template is being parsed. User-defined attributes (workflow step) are prepended with $USR:: and converted to upper case, so for a key pair ("MyKey", "MyValue") the placeholder $USR::MYKEY will yield the value "MyValue". We strongly suggest using the dump-all.htm template to generate a list of all available keywords for a given VHIST file before modifying a template. Keywords may contain alphanumerical characters, "_" and ":" (but not as the last character).
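The naming rule for user-defined attributes can be sketched in Python as follows. This is not the VHIST parser: the functions user_keyword and fill are hypothetical, and the regular expression assumes that the restriction "not as the last character" applies to ":".

```python
import re

# Assumed keyword syntax: "$", then alphanumerics, "_" or ":", not ending in ":".
KEYWORD_RE = re.compile(r"\$[A-Za-z0-9_:]*[A-Za-z0-9_]")


def user_keyword(key):
    """Build the placeholder name for a user-defined workflow-step attribute."""
    return "$USR::" + key.upper()


def fill(template, user_attributes):
    """Replace known $USR:: placeholders; unknown keywords are left untouched."""
    values = {user_keyword(k): v for k, v in user_attributes.items()}
    return KEYWORD_RE.sub(lambda m: values.get(m.group(0), m.group(0)), template)


print(user_keyword("MyKey"))                               # $USR::MYKEY
print(fill("Value: $USR::MYKEY.", {"MyKey": "MyValue"}))   # Value: MyValue.
```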
The template can provide additional information on how information from the VHIST file should be rendered in HTML (or PDF) which can depend on the existence or values of certain entries. In addition, it is possible to define blocks of arbitrary HTML commands and use them as building blocks if certain conditions are met. In order to remain HTML compatible, we use HTML/XML comments and evaluate them in our own parser.
For each section/workflow step of the VHIST file, the template will be processed with a set of keywords for that particular workflow step, so e.g. $SECTION will contain the current section's name. There are named blocks with reserved names: __top__ and __bottom__ are only evaluated once and __files__ is evaluated for each file (of each section/workflow step).
In order to provide maximum flexibility, you can access all properties of all files of a section from the template. Keywords such as $FILEPATH[3] are available, which in this case will refer to the path of the third file of that section. If you write $FILEPATH[*], the "*" will match the current file's index.
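The indexing scheme can be made concrete with a small Python sketch. It is an illustration only: the function expand and the sample file properties are hypothetical, and indices are taken to be 1-based because $FILEPATH[3] refers to the third file.

```python
import re

INDEXED_RE = re.compile(r"\$([A-Z_:]+)\[(\*|\d+)\]")


def expand(text, files, current):
    """Replace $NAME[n] and $NAME[*] using 1-based file indices."""
    def lookup(match):
        name, index = match.group(1), match.group(2)
        position = current if index == "*" else int(index)
        if 1 <= position <= len(files) and name in files[position - 1]:
            return files[position - 1][name]
        return match.group(0)  # leave unknown placeholders untouched
    return INDEXED_RE.sub(lookup, text)


# Hypothetical file properties of one section.
files = [{"FILEPATH": "a.img"}, {"FILEPATH": "b.hdr"}, {"FILEPATH": "c.txt"}]
print(expand("$FILEPATH[3] / $FILEPATH[*]", files, current=2))  # c.txt / b.hdr
```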
Similarities to the C/C++ preprocessor syntax are intended in the following list of commands for HTML/PDF generation:
- #ifdef $KEYWORD: If $KEYWORD has a value (has been set in the VHIST file) the next HTML block will be used, otherwise it will be skipped.
- #ifndef $KEYWORD: If $KEYWORD has a value (has been set in the VHIST file) the next HTML block will be skipped, otherwise it will be used.
- #ifequal $KEYWORD value: If $KEYWORD is equal to value, the next HTML block will be used, otherwise it will be skipped.
- #else: May only be used in conjunction with one of the #if statements.
- #begin(<name>): Defines the start position of a named block.
- #end(<name>): Defines the end position of a named block.
- #showblock(<name>): Inserts a named block at this position.
- #evenodd begin: Starts processing for $EVENODD keywords.
- #evenodd end: Ends processing for $EVENODD keywords.
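The block commands listed above can be illustrated with a minimal Python sketch. It is not the VHIST parser: it assumes that the commands are written inside HTML/XML comments, that block definitions are removed from the output, and that a block may be shown more than once. The function name expand_blocks is hypothetical.

```python
import re

BLOCK_RE = re.compile(
    r"<!--\s*#begin\((\w+)\)\s*-->(.*?)<!--\s*#end\(\1\)\s*-->", re.DOTALL
)
SHOW_RE = re.compile(r"<!--\s*#showblock\((\w+)\)\s*-->")


def expand_blocks(template):
    """Collect named blocks, drop their definitions and insert them on demand."""
    blocks = {name: body for name, body in BLOCK_RE.findall(template)}
    without_definitions = BLOCK_RE.sub("", template)
    return SHOW_RE.sub(lambda m: blocks.get(m.group(1), ""), without_definitions)


template = (
    "<!-- #begin(note) --><p>Note</p><!-- #end(note) -->"
    "<h1>Title</h1><!-- #showblock(note) --><!-- #showblock(note) -->"
)
print(expand_blocks(template))  # <h1>Title</h1><p>Note</p><p>Note</p>
```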
In the following example, the first and last lines are HTML/XML comments which will be ignored by any HTML browser. When used in a template, the second line will only be used for rendering HTML/PDF output if the keyword $COMMENT has been set (which is the case if the current section/workflow step contains a comment).
<!-- #ifdef $COMMENT -->
Comment: $COMMENT<br/>
<!-- #endif -->
The next example demonstrates how to set content differently for HTML and for PDF generation:
<!-- #ifndef $ISPDFGEN -->
this will only appear in the HTML version
<!-- #else -->
this will only appear in the PDF file
<!-- #endif -->
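To make the behaviour of the two examples above easier to follow, here is a minimal Python sketch of how such conditional comments could be evaluated. It is a simplified stand-in for the VHIST parser, not its implementation: the function render is hypothetical, nesting and #ifequal are not handled, and keywords are passed in as a plain dictionary.

```python
import re

DIRECTIVE_RE = re.compile(r"<!--\s*#(ifdef|ifndef|else|endif)\s*(\$[\w:]+)?\s*-->")


def render(template, values):
    """Evaluate non-nested #ifdef/#ifndef/#else/#endif and substitute keywords."""
    output, keep, position = [], True, 0
    for match in DIRECTIVE_RE.finditer(template):
        if keep:
            output.append(template[position:match.start()])
        directive, keyword = match.group(1), match.group(2)
        if directive == "ifdef":
            keep = keyword in values
        elif directive == "ifndef":
            keep = keyword not in values
        elif directive == "else":
            keep = not keep
        else:  # endif closes the conditional block
            keep = True
        position = match.end()
    if keep:
        output.append(template[position:])
    text = "".join(output)
    # Replace longer keywords first so that "$A" cannot clobber "$AB".
    for key in sorted(values, key=len, reverse=True):
        text = text.replace(key, values[key])
    return text


comment = "<!-- #ifdef $COMMENT -->Comment: $COMMENT<br/><!-- #endif -->"
print(render(comment, {"$COMMENT": "first run"}))  # Comment: first run<br/>

target = "<!-- #ifndef $ISPDFGEN -->html<!-- #else -->pdf<!-- #endif -->"
print(render(target, {}))                     # html
print(render(target, {"$ISPDFGEN": "TRUE"}))  # pdf
```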
The following special keywords are available (in alphabetical order):
- $EVENODD: Processing of $EVENODD is activated between occurrences of #evenodd begin and #evenodd end. The keyword will be replaced by the strings "odd" and "even", resp. These terms will be used in an alternating fashion, starting with "odd". This feature is generally useful when defining tables with rows that should have an alternating background color and it is required when the table's rows depend on some conditions which are not fully known when designing the table's layout. Template black-and-white.htm makes use of this feature.
- $FILEHASUSRATTRIBUTES[*]: Will be set to "TRUE" if any User-defined attributes (files) have been defined for the workflow step, otherwise this keyword is undefined. User-defined attributes can be accessed with keywords of the form $USR::MYKEY[*].
- $ISPDFGEN: Will be set to "TRUE" if content is generated for a PDF file, otherwise this keyword is undefined.
- $ISFIRSTSECTION: Will be set to "TRUE" if the current section is the first section of the output (either HTML or PDF), otherwise this keyword is undefined.
- $PREVIEWID[*]: Contains the ID of the current file's preview image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
- $THUMBNAILID[*]: Contains the ID of the current file's thumbnail image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
- $VHISTFILE: The file name (without the full path) of the selected VHIST file.
- $VHISTFILESIZEINBYTES: The size of the selected VHIST file in Bytes.
- $VHISTFILESIZEINMB: The size of the selected VHIST file in MB.
- $VHISTFILELASTMODIFIED: The last-modified-date of the VHIST file.
- $WSHASUSRATTRIBUTES: Will be set to "TRUE" if any User-defined attributes (workflow step) have been defined for the workflow step, otherwise this keyword is undefined. User-defined attributes can be accessed with keywords of the form $USR::MYKEY.
- $WSPREVIEWID: Contains the ID of the section's preview image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
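The alternating replacement of $EVENODD described in the list above can be sketched in Python. The function fill_evenodd is hypothetical; it assumes that every occurrence of the keyword inside the active region advances the alternation by one step, and it expects to be given only the text of that region.

```python
import re
from itertools import cycle


def fill_evenodd(region):
    """Replace each $EVENODD occurrence alternately, starting with "odd"."""
    labels = cycle(["odd", "even"])
    return re.sub(r"\$EVENODD", lambda m: next(labels), region)


rows = '<tr class="$EVENODD"/><tr class="$EVENODD"/><tr class="$EVENODD"/>'
print(fill_evenodd(rows))
# <tr class="odd"/><tr class="even"/><tr class="odd"/>
```

Combined with a stylesheet that defines different background colors for the classes "odd" and "even", this yields striped table rows even if some rows are skipped by conditions.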
Footnotes:
[1] S. Vollmar, A. Hüsgen, M. Sué, The VHIST Homepage (2009) URL http://www.nf.mpg.de/vhist
[2] S. Vollmar, A. Hüsgen, M. Sué, M. May, R. Krais, Workflow Histories and Image Data with Validation, Abstracts of the XI Turku PET Symposium (2008) p. 108. URL http://www.pet.fi/files/PET2008_book_of_abstracts.pdf
[3] Python Programming Language Official Website. URL http://www.python.org
[4] Qt-Trolltech URL http://trolltech.com/products/qt
[5] Adobe Corp., Adobe Systems Incorporated, PDF Reference, fourth edition, Adobe Portable Document Format Version 1.5. URL http://www.adobe.com/devnet/pdf/pdf_reference.html
[6] GNU General Public License (GPL). URL http://www.gnu.org/licenses
[7] Microsoft Inc., Command prompt (Cmd.exe) command-line string limitation. URL http://support.microsoft.com/kb/830473/EN-US
[8] GNU Project, BASH - GNU Project - Free Software Foundation (FSF). URL http://www.gnu.org/software/bash
[9] R. Rivest, The MD5 Message-Digest Algorithm, RFC 1321.
[10] L.P. Deutsch, DEFLATE Compressed Data Format Specification version 1.3, RFC 1951. URL http://www.ietf.org/rfc/rfc1951.txt
[11] N. Freed, N. Borenstein, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, RFC 2046. URL http://www.ietf.org/rfc/rfc2046.txt
[12] W3C, Unicode (UTF-8, UTF-16) URL http://www.ietf.org/rfc/rfc2781.txt
[13] World Wide Web Consortium, XHTML2 Working Group Home Page. URL http://www.w3.org/MarkUp
Date: 2010-01-26
HTML generated by org-mode 6.34 in emacs 23