Information on Session Table

Anne Krause edited this page Mar 27, 2018 · 9 revisions

To process incoming data, the indexer needs read access to the file systems or folders to be ingested.

For carrier-based volumes such as floppy disks, optical disks or hard drives, you are expected to first create forensically sound sector images of the physical volumes using ddrescue or a similar tool.

The indexer then makes these images available in read-only mode using the mount command defined in the mount column. (The filesystem type is irrelevant as long as Linux supports it.)

The disk images are mounted at a temporary mount point (folder) specified in the mountpoint column.

It is also possible to work on an incoming folder without mounting a sector image first; in this case, simply provide the ingest folder as both datapath and mountpoint.

Incoming volumes or folders are called “sessions” by the indexer. All basic information about the sessions is stored in the “session table”, which has the following columns.

bestandid

This unique collection ID supports multitenancy. You can add as many tenants as you need for your indexer projects.

sessionid

The indexer needs a unique “session ID” for every incoming volume or data package. Most indexer scripts take a sessionid as an argument to work on.

name

If you already have a numbering or naming scheme for your volumes, you can name them here.

datapath

The fully qualified path (including absolute folder hierarchy, filename and extension) of a sector image to be mounted, or the fully qualified path of an incoming folder to work on.

localpath

The localpath is an internal base path where the indexer software saves a copy of all incoming data under an artificial name. The path includes the bestandid and the group.

mountpoint

Temporary mount point for sector images (such as /mnt), or identical to datapath in the case of a file-based ingest.

mount

The mount column contains the complete command line for mounting an image, including any options. The placeholders $$IMAGE$$ and $$MOUNTPOINT$$ can be used for the contents of datapath and mountpoint. If no image needs to be mounted, leave this field blank.

umount

The umount column contains the complete command line for unmounting an image, including any options. The placeholder $$MOUNTPOINT$$ can be used for the content of mountpoint. If no image needs to be mounted and unmounted, leave this field blank.
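The placeholder handling for the mount and umount columns can be pictured with a short Python sketch. This is not the indexer's own code: the function name expand_command, the example paths, the mount options and the shell quoting of the substituted values are all assumptions made for illustration.

```python
import shlex


def expand_command(template, datapath, mountpoint):
    """Fill $$IMAGE$$ and $$MOUNTPOINT$$ in a mount or umount template.

    Returns None for a blank template, i.e. nothing has to be mounted
    or unmounted (folder-based ingest).
    """
    if not template or not template.strip():
        return None
    # Assumption: values are shell-quoted so that paths containing
    # spaces stay a single argument. The indexer may handle this differently.
    return (
        template
        .replace("$$IMAGE$$", shlex.quote(datapath))
        .replace("$$MOUNTPOINT$$", shlex.quote(mountpoint))
    )


# Hypothetical column values, not taken from a real session table.
datapath = "/data/images/disk 001.img"
mountpoint = "/mnt/indexer"

mount_cmd = expand_command(
    "mount -o ro,loop $$IMAGE$$ $$MOUNTPOINT$$", datapath, mountpoint
)
umount_cmd = expand_command("umount $$MOUNTPOINT$$", datapath, mountpoint)
print(mount_cmd)
print(umount_cmd)
```

For a folder-based ingest, both columns are blank and the sketch returns None for each, so no command would be run.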

fscharset

This column specifies the character set of the incoming filesystem.
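Why the character set matters can be shown with a minimal Python sketch. It assumes that fscharset is applied to the raw file names read from the volume and that its value maps to a codec name Python knows; the function decode_filename and the sample file name are hypothetical.

```python
def decode_filename(raw_name: bytes, fscharset: str) -> str:
    """Decode a raw directory entry using the session's character set."""
    # errors="replace" substitutes U+FFFD for bytes that are invalid
    # in the given charset instead of raising an exception.
    return raw_name.decode(fscharset, errors="replace")


# The byte 0x81 is "ü" in code page 437 (DOS) but is not valid UTF-8.
raw = b"M\x81ller.txt"
as_cp437 = decode_filename(raw, "cp437")
as_utf8 = decode_filename(raw, "utf-8")
print(as_cp437)
print(as_utf8)
```

With the matching character set the name is readable; with the wrong one the umlaut is lost and replaced by a substitution character.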

description

This is an optional column for further describing your volumes.

group

The group column specifies the class of the volume, so that sessions can be combined into large groups such as “fd” for floppy disks, “od” for optical disks or “hd” for hard drives. Most indexer scripts that take a sessionid as an argument also accept a group name to work on a set of sessions/volumes.

solrpath

Points to the Solr core.

solrtime

Indicates the time at which the session’s metadata was transferred into Solr.

ingesttime

Indicates the time when the volume was successfully ingested into the file table with recurse.php.

ignore

You can exclude volumes from processing by entering a “1” in the ignore column. This is useful for bad volumes that cannot be mounted successfully.
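How a script might pick its sessions by sessionid or group while honouring the ignore column is sketched below in Python. This is an illustration under assumptions, not the indexer's actual selection logic: the function select_sessions and the sample rows are hypothetical, and the rows are plain dictionaries rather than database records.

```python
def select_sessions(sessions, selector):
    """Return rows matching a sessionid or a group name, skipping ignored ones."""
    wanted = str(selector)
    return [
        row for row in sessions
        # Match either the session ID or the group name.
        if wanted in (str(row["sessionid"]), row["group"])
        # A "1" in the ignore column excludes the volume.
        and str(row.get("ignore") or "0") != "1"
    ]


# Hypothetical session rows with only the columns needed here.
sessions = [
    {"sessionid": 1, "group": "fd", "ignore": 0},
    {"sessionid": 2, "group": "fd", "ignore": 1},
    {"sessionid": 3, "group": "od", "ignore": 0},
]

fd_ids = [row["sessionid"] for row in select_sessions(sessions, "fd")]
single = [row["sessionid"] for row in select_sessions(sessions, 3)]
ignored = select_sessions(sessions, 2)
print(fd_ids, single, ignored)
```

In this sketch, session 2 is skipped both when it is addressed through its group and when it is addressed directly by its ID.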