InveNIO Extractor

A Flask-based web application for extracting, storing, browsing, editing, and downloading metadata from DICOM images generated by the stimulated Raman histology (SRH) system, InveNIO.

The application reads DICOM files from a configured image directory, extracts selected metadata using pydicom, stores the metadata in a SQLite database, generates thumbnails and TIFF derivatives for generated SRH images, and provides a browser-based interface for reviewing metadata and opening the corresponding images.
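
The extract-and-store step described above can be sketched roughly as follows. This is a minimal illustration and not the code in metadata_extractor.py: the table name `images`, its columns, and the chosen DICOM keywords are assumptions made for the example. pydicom is imported lazily so that the storage half can be tried without it.

```python
import sqlite3
from pathlib import Path

# Hypothetical selection of DICOM keywords; the real application may extract others.
KEYWORDS = ("PatientID", "StudyDate", "Modality")


def read_metadata(dicom_path):
    """Read selected header fields from one DICOM file, skipping pixel data."""
    import pydicom  # third-party dependency, imported only when needed

    ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    row = {"file_path": str(Path(dicom_path))}
    for keyword in KEYWORDS:
        value = ds.get(keyword)  # None when the element is absent
        row[keyword] = None if value is None else str(value)
    return row


def store_metadata(conn, row):
    """Insert or replace one metadata row in a hypothetical 'images' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "file_path TEXT PRIMARY KEY, PatientID TEXT, StudyDate TEXT, Modality TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO images (file_path, PatientID, StudyDate, Modality) "
        "VALUES (:file_path, :PatientID, :StudyDate, :Modality)",
        row,
    )
    conn.commit()
```

Keying the table on the file path means that re-running the extraction overwrites a row instead of duplicating it, which is one plausible way to make repeated updates safe.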

Main features

  • Extracts DICOM metadata from folders of InveNIO/stimulated Raman histology images.
  • Stores image/session metadata in a SQLite database.
  • Displays studies, image entries, specimen metadata, and custom user-defined fields in a Flask web interface.
  • Supports metadata search.
  • Provides an editor for generated metadata and custom data fields.
  • Generates JPEG thumbnails for generated SRH images.
  • Generates TIFF derivatives for generated SRH images.
  • Allows DICOM and TIFF downloads, with optional DICOM anonymization.
  • Can be run locally with a Python virtual environment or in Docker with Gunicorn.
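
To give a rough idea of what the optional anonymization in the download feature could involve, the sketch below blanks a few identifying attributes on an object that exposes DICOM keywords as attributes, as a pydicom Dataset does. The keyword list and the function name are assumptions for illustration; they are not taken from file_exporter.py, and a real anonymizer would normally cover many more elements.

```python
from types import SimpleNamespace

# Hypothetical list of identifying DICOM keywords to blank.
IDENTIFYING_KEYWORDS = ("PatientName", "PatientID", "PatientBirthDate")


def blank_identifying_fields(dataset, keywords=IDENTIFYING_KEYWORDS):
    """Blank the listed attributes in place where present; return the names changed."""
    changed = []
    for keyword in keywords:
        if hasattr(dataset, keyword):
            setattr(dataset, keyword, "")
            changed.append(keyword)
    return changed


# Stand-in object for demonstration only; it carries no PatientBirthDate.
example = SimpleNamespace(PatientName="Doe^Jane", PatientID="12345", Modality="OT")
print(blank_identifying_fields(example))  # ['PatientName', 'PatientID']
```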

Repository layout

.
├── extractor_app/
│   ├── __init__.py              # Flask application factory and route registration
│   ├── commands.py              # Flask CLI commands: init-db, update-db
│   ├── config_handler.py        # config.ini reader
│   ├── db.py                    # SQLite connection and schema initialization
│   ├── file_exporter.py         # DICOM/TIFF export and anonymization
│   ├── image_generator.py       # Thumbnail and TIFF generation
│   ├── metadata_extractor.py    # DICOM metadata extraction and database writing
│   ├── schema.sql               # SQLite schema
│   ├── static/                  # JavaScript, generated Resources, OpenSeadragon assets
│   ├── templates/               # Flask/Jinja templates
│   └── views/                   # Flask blueprints
├── config.ini                   # Runtime configuration
├── Dockerfile                   # Docker image definition
├── gunicorn_config.py           # Gunicorn configuration
├── requirements.txt             # Runtime Python dependencies
├── setup.sh / setup.bat         # Local setup helpers
├── run.sh / run.bat             # Local development run helpers
├── update_db.sh / update_db.bat # Manual database update helpers
└── reset_db.sh / reset_db.bat   # Database reset helpers

Configuration

Before running the application, edit the paths in config.ini so that they match your environment.

Configuration fields

Field               Meaning
PathToDatabase      Path to the SQLite database file.
PathToImagesFolder  Directory containing folders of DICOM images.
ExtractedFilePath   Base output path for extracted/generated files.
NumberOfImgSlices   Number of horizontal slices used when rendering large DICOM images directly in the browser.
ThumbnailARScale    Thumbnail scaling parameter.
CustomData          Comma-separated list of editable custom metadata fields.
AllowedIPs          Comma-separated list of client IP addresses allowed to access the app.
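
As an illustration of how these fields might be read, the sketch below parses a sample configuration with Python's standard configparser module. The `[Settings]` section name and every value shown are placeholders assumed for the example; check the config.ini shipped with the repository for the actual section name and defaults.

```python
import configparser

# Sample content; the [Settings] section name and all values are placeholders.
SAMPLE = """
[Settings]
PathToDatabase = instance/extractor.sqlite
PathToImagesFolder = instance/images
NumberOfImgSlices = 4
CustomData = Diagnosis, Notes
AllowedIPs = 127.0.0.1, 192.168.1.10
"""


def split_list(raw):
    """Split a comma-separated value into stripped, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.optionxform = str  # keep the CamelCase field names as written
parser.read_string(SAMPLE)
settings = parser["Settings"]

custom_fields = split_list(settings.get("CustomData", fallback=""))
allowed_ips = split_list(settings.get("AllowedIPs", fallback=""))
slices = settings.getint("NumberOfImgSlices")
```

Stripping whitespace around each item lets the comma-separated fields be written with or without spaces after the commas.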

Expected image layout:

instance/images/
├── <study-or-session-folder>/
│   ├── <image_1>.dcm
│   ├── <image_2>.dcm
│   └── ...
└── <another-study-or-session-folder>/
    └── ...
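
A folder tree like the one above could be scanned with a short helper such as the following. This is only a sketch: the function name is hypothetical, and it assumes that DICOM files sit directly inside each study or session folder and carry a .dcm extension, which may differ from what metadata_extractor.py actually does.

```python
from pathlib import Path


def find_dicom_files(images_root):
    """Map each study/session folder name to a sorted list of its .dcm files."""
    studies = {}
    for folder in sorted(Path(images_root).iterdir()):
        if not folder.is_dir():
            continue  # ignore stray files next to the study folders
        files = sorted(p for p in folder.iterdir() if p.suffix.lower() == ".dcm")
        if files:
            studies[folder.name] = files
    return studies
```

For example, `find_dicom_files("instance/images")` returns a dictionary keyed by folder name; folders without any .dcm files are left out.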

Local installation

Clone the repository:

git clone https://github.com/vojdam/invenio_extractor.git
cd invenio_extractor

Edit config.ini and set the paths for your environment. Also change the secret key in extractor_app/__init__.py.

Run the setup.sh or setup.bat helper script to create a virtual environment and then initialize and update the database.

Run the run.sh or run.bat helper script to run the web app.

Docker usage

Build the image:

docker build -t invenio_extractor:latest .

Run the container:

docker run -d \
  --name invenio_extractor \
  -p 8080:8080 \
  -v /path/to/local/instance:/invenio_extractor/instance \
  invenio_extractor:latest

Then open:

http://localhost:8080/

Mounting the whole instance/ directory is recommended so that both the input images and the SQLite database persist outside the container.

If you only mount the image folder, for example:

-v /path/to/images:/invenio_extractor/instance/images

the database remains inside the container filesystem and will be lost when the container is removed.
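
For orientation, a Gunicorn configuration consistent with the port mapping above might look like the sketch below. The values are illustrative assumptions and do not reproduce the repository's gunicorn_config.py; only the setting names `bind`, `workers`, and `timeout` are standard Gunicorn settings.

```python
# Illustrative Gunicorn settings; not the repository's actual gunicorn_config.py.
import multiprocessing

# Listen on all interfaces inside the container; 8080 matches the published port.
bind = "0.0.0.0:8080"

# Assumed worker count: one per CPU core, capped at four.
workers = min(4, multiprocessing.cpu_count())

# Seconds before an unresponsive worker is restarted. The generous value is an
# assumption, chosen on the guess that image generation requests can be slow.
timeout = 120
```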

Flask CLI commands

Initialize or reset the SQLite database:

python -m flask --app extractor_app init-db

Update the database from the configured image folder:

python -m flask --app extractor_app update-db

Renew extraction from all folders:

python -m flask --app extractor_app update-db --renew
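
If these commands need to be run from another Python script, for example a scheduled job, they could be wrapped roughly as shown below. The helper names are hypothetical; the sketch only assembles the same command lines listed above and assumes it is run from the repository root with the project's virtual environment active.

```python
import subprocess
import sys


def flask_command(command, *options):
    """Build the argument list for one of the application's Flask CLI commands."""
    return [sys.executable, "-m", "flask", "--app", "extractor_app", command, *options]


def run_update(renew=False):
    """Run update-db, optionally with --renew; raises CalledProcessError on failure."""
    options = ["--renew"] if renew else []
    subprocess.run(flask_command("update-db", *options), check=True)
```

Using `sys.executable` instead of a bare `python` keeps the call inside whichever interpreter is running the script.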

License

This project is distributed under the MIT License. See LICENSE for details.
