You can administer the search engine (e.g. configure crawling of directories)
via the web admin interface,
via the REST API, or
via command line tools
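As a rough sketch of the REST route, the snippet below only builds the request URL for asking the engine to (re)index a path; the host, endpoint path (`/search-apps/api/index-file`), and `uri` parameter are assumptions for illustration, so check the REST-API section for the actual endpoints of your installation before sending anything.

```python
from urllib.parse import urlencode

# Hypothetical base URL of a local installation; adjust host and
# endpoint path to whatever your REST-API documentation specifies.
BASE_URL = "http://localhost/search-apps/api/index-file"

def build_index_request(uri: str) -> str:
    """Build the REST call that would ask the engine to (re)index a file or directory."""
    return BASE_URL + "?" + urlencode({"uri": uri})

print(build_index_request("/var/opt/documents"))
```

Sending the request (e.g. with `urllib.request.urlopen` or `curl`) then requires a running instance reachable at that address.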
Admin
Getting started
Install
Config
Command line tools
REST-API
Client integration
Processing
Architecture overview (Components & modules)
Data integration: Crawling, extraction and import (ETL)
Document processing, extraction, data analysis and data enrichment chain
Exclude (Blacklisting)
Data enrichment and data analysis (Enhancement)
Automated tagging and filtering (Rules and named entities extraction)
Scaling and optimization for faster indexing (parallel processing and search cluster)
Export enriched and structured data
Connectors (Import)
Files and directories (Filesystem or fileserver)
Newsfeed (RSS-Feed)
Website (HTTP)
Database (SQL)
E-Mail (IMAP)
Linked Data (RDF Graph)
Wiki (Mediawiki)
Hypothesis Annotations
Extract structured data from websites (Web scraper)
Generic (other connectors, protocols and formats)
Data enrichment & data analysis (Enhancer)
Automatic text recognition (OCR)
Metadata from Resource Descriptions (RDF)
Automated tagging (Rules and named entities extraction)
Archive files (ZIP)
XMP sidecar files (XMP)
CSV-Spreadsheets (CSV)
Locations (GeoNames)
Text patterns (Regular expressions)
Named Entity Recognition (NER)
More data enrichment engines
Developing your own data enrichment plugins
Trigger
Filesystem monitoring