Installation

Install the requirements:

pip install -r requirements.txt

Install the library:

pip install git+https://github.com/cedadev/facet-scanner

Basic Usage

The facet_scanner command bulk processes a dataset, which is useful for testing and initialisation:

usage: facet_scanner [-h] [--rerun] [--num-files NUM_FILES] [--conf CONF]
                     path processing_path

Process path for facets and update the index

positional arguments:
  path                  Path to process
  processing_path       Path to output intermediate files

optional arguments:
  -h, --help            show this help message and exit
  --rerun               Disable paging to disk on rerun
  --num-files NUM_FILES
                        Number of files per lotus job
  --conf CONF
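
As a quick illustration of how these options combine, the sketch below assembles a facet_scanner command line in Python. Only the flag names and argument order come from the help text above; the helper `build_command`, the example paths, and the page size of 500 are hypothetical and chosen purely for illustration.

```python
import shlex


def build_command(path, processing_path, num_files=None, conf=None, rerun=False):
    """Assemble a facet_scanner argument list from the options in the help text.

    This is a hypothetical helper, not part of the facet-scanner library.
    """
    cmd = ["facet_scanner"]
    if rerun:
        cmd.append("--rerun")
    if num_files is not None:
        cmd += ["--num-files", str(num_files)]
    if conf is not None:
        cmd += ["--conf", conf]
    # Positional arguments come last: the path to process, then the
    # directory for intermediate files.
    cmd += [path, processing_path]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and page size, for illustration only.
    cmd = build_command("/path/to/dataset", "/path/to/scratch", num_files=500)
    print(shlex.join(cmd))
    # To actually run it (requires facet_scanner to be installed):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.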

The script takes the supplied path and queries Elasticsearch for all files beneath it. The --num-files flag sets the page size, which determines how many files go into each LOTUS batch job.
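
To make the effect of the page size concrete, here is a minimal sketch of splitting a file listing into fixed-size batches. It is not the library's actual implementation: the function `batch_files` and the example file names are assumptions, and the list simply stands in for whatever the Elasticsearch query returns.

```python
from itertools import islice


def batch_files(file_paths, num_files):
    """Yield successive lists of at most num_files paths.

    Hypothetical helper illustrating how a page size groups files;
    it is not taken from the facet-scanner source.
    """
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    iterator = iter(file_paths)
    while True:
        # Take the next page of up to num_files items.
        batch = list(islice(iterator, num_files))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Made-up file listing standing in for the query results.
    files = [f"/path/to/dataset/file_{i}.nc" for i in range(7)]
    for job_number, batch in enumerate(batch_files(files, num_files=3)):
        print(job_number, len(batch))
```

Under this scheme every batch holds exactly the page size except possibly the last, so a smaller --num-files value means more, shorter jobs and a larger value means fewer, longer ones.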

Welcome to CEDA Facet Scanner’s documentation!
