README
======

.. role:: raw-html-m2r(raw)
   :format: html
PROVEE - PROgressiVe Explainable Embeddings
-------------------------------------------

.. image:: https://img.shields.io/badge/status-active-success.svg
   :target: https://git.science.uu.nl/vig/provee/provee-local-projector/
   :alt: Status

.. image:: https://git.science.uu.nl/vig/provee/provee-local-projector/badges/master/pipeline.svg
   :target: https://git.science.uu.nl/vig/provee/provee-local-projector/-/pipelines
   :alt: Gitlab pipeline status

.. image:: https://git.science.uu.nl/vig/provee/provee-local-projector/badges/master/coverage.svg
   :target: https://git.science.uu.nl/vig/provee/provee-local-projector/-/graphs/master/charts
   :alt: Gitlab coverage

.. image:: https://readthedocs.org/projects/provee-local-projector/badge/?version=latest
   :target: https://provee-local-projector.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/badge/%20%20%F0%9F%93%A6%F0%9F%9A%80-semantic--release-e10079.svg
   :target: https://github.com/semantic-release/semantic-release
   :alt: semantic-release

.. image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: /LICENSE
   :alt: License

----

Deep Neural Networks (DNNs), and their resulting **latent or embedding data spaces, are key to analyzing big data** in various domains such as vision, speech recognition, and natural language processing (NLP). However, embedding spaces are high-dimensional and abstract, and thus not directly understandable. We aim to develop a software framework to visually explore and explain how embeddings relate to the actual data fed to the DNN. This enables both DNN developers and end-users to understand the current black-box workings of DNNs, leading to better-engineered networks and to explainable, transparent DNN systems whose behavior can be trusted by their end-users. Our central aims, namely opening DNN black boxes, making complex data understandable for data science novices, and raising trust and transparency, are core topics in Visual Analytics (VA) and NLP research. PROVEE will advertise and apply VA in a wider scope, with impact across sciences (medicine, engineering, biology, physics) where researchers use big data and deep learning.

📝 Table of Contents
--------------------

* `About <#about-a-name-about-a>`_
* `Feature/Performance Comparison <./COMPARISON.html>`_
* `Getting Started <#getting-started-a-name-getting-started-a>`_
* `Running Tests <#running-the-tests-a-name-tests-a>`_
* `Usage <#usage-a-name-usage-a>`_
* `Deployment <#deployment-a-name-deployment-a>`_
* `Built Using <#built-using-a-name-built-using-a>`_
* `Contributing <./CONTRIBUTING.html>`_
* `Authors <#authors-a-name-authors-a>`_
* `Acknowledgments <#acknowledgements-a-name-acknowledgement-a>`_

🧐 About :raw-html-m2r:`<a name = "about"></a>`
--------------------------------------------------------------------------------

In this repository you will find PROVEE, short for Progressive Explainable Embeddings, a visual-interactive system for representing embedding data spaces in a user-friendly 2D projection. The idea behind Progressive Analytics, as described e.g. by Fekete and Primet, is to provide a rapid data exploration pipeline with a feedback loop from the system to the analyst with a latency below about 10 seconds. Research has shown that, when performing exploratory analysis, humans need a latency below about 10 seconds to remain focused and use their short-term memory efficiently. Therefore, PROVEE's goals are (1) to provide increasingly meaningful partial results as the algorithms execute and (2) to provide visualizations that minimize distractions by not changing views excessively, all of this with high scalability of the input data combined with memory efficiency. *Note that these goals are adapted from the aforementioned publication.*

PROVEE's architecture includes (1) analysis algorithms (particularly, incremental projection algorithms like IPCA), (2) intuitive, local user interfaces/visualizations, and (3) intermediate data storage and transfer. Core to our system is an innovative, progressive analysis workflow targeting a human-algorithm feedback loop with a latency under ~10 seconds, to maintain the user's efficiency during exploration tasks.
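The incremental projection idea can be sketched with scikit-learn's ``IncrementalPCA``, used here purely for illustration (PROVEE's actual pipeline may use a different IPCA implementation): embeddings are consumed in batches, the model is refined after each batch, and the current 2D projection can be redrawn each time, yielding increasingly meaningful partial results.

```python
# Illustrative sketch of progressive (incremental) 2D projection using
# scikit-learn's IncrementalPCA; PROVEE's own pipeline may differ.
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
# Stand-in for DNN embeddings: 10,000 points in a 128-dimensional space.
embeddings = rng.normal(size=(10_000, 128))

ipca = IncrementalPCA(n_components=2)
batch_size = 1_000

for start in range(0, len(embeddings), batch_size):
    batch = embeddings[start:start + batch_size]
    ipca.partial_fit(batch)  # refine the projection model with this batch
    # Project everything seen so far with the current (partial) model.
    partial_2d = ipca.transform(embeddings[:start + len(batch)])
    # A visualization layer (e.g. Vispy) would redraw the scatter plot
    # here, keeping view changes small between iterations.

print(partial_2d.shape)  # (10000, 2)
```

Because each loop iteration produces a usable intermediate projection, the analyst gets feedback well within the ~10-second latency budget instead of waiting for a full batch computation to finish.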
PROVEE will be scalable to big data; generic (able to handle data from many application domains); and easy to use (requiring no specialist programming from the user).

Please also refer to our `Performance and feature comparison <./COMPARISON.html>`_ for the available (visualization and analysis) tools that we compared PROVEE against.

.. image:: https://git.science.uu.nl/vig/provee/provee-local-projector/uploads/a66c1f70d88dde26c2364ab149eb6ff2/proveeGIF2.gif
   :target: https://git.science.uu.nl/vig/provee/provee-local-projector/uploads/a66c1f70d88dde26c2364ab149eb6ff2/proveeGIF2.gif
   :alt: proveeGIF2

🏁 Getting Started :raw-html-m2r:`<a name = "getting_started"></a>`
--------------------------------------------------------------------------------

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See `deployment <#deployment>`_ for notes on how to deploy the project on a live system.

Prerequisites
^^^^^^^^^^^^^

* Python 3
* `Conda <https://docs.conda.io/>`_

Installing
^^^^^^^^^^

Clone the latest PROVEE repository from Gitlab:

.. code-block::

   git clone https://git.science.uu.nl/vig/provee/provee-local-projector.git

Create a conda environment. Your new environment will be named ``provee``:

.. code-block::

   conda env create -f environment.yml

Activate the environment:

.. code-block::

   conda activate provee

To deactivate the environment, use:

.. code-block::

   conda deactivate

Running
^^^^^^^

To run the project, first activate the conda environment. Afterwards, run ``main.py`` from inside the ``LocalProjector`` directory:

.. code-block::

   cd LocalProjector
   python main.py

🔧 Running the tests :raw-html-m2r:`<a name = "tests"></a>`
--------------------------------------------------------------------------------

The tests can be found in the ``LocalProjector/test`` folder.

Basic Unit Tests
^^^^^^^^^^^^^^^^

Tests can be run using pytest. First activate the conda environment. From the root folder, tests can be run using:

.. code-block::

   pytest LocalProjector/test/

To enable coverage, use:
.. code-block::

   pytest --cov=LocalProjector/src/ LocalProjector/test/

🎈 Usage :raw-html-m2r:`<a name = "usage"></a>`
--------------------------------------------------------------------------------

Notes about how to use the system are TBD; a video is coming soon.

🚀 Deployment :raw-html-m2r:`<a name = "deployment"></a>`
--------------------------------------------------------------------------------

If you want to deploy a live system, refer to the `Deployment Guide <./DEPLOYMENTGUIDE.html>`_.

⛏️ Built Using :raw-html-m2r:`<a name = "built_using"></a>`
--------------------------------------------------------------------------------

* `Vispy <https://vispy.org/>`_ - 2D visualization
* `Faiss <https://github.com/facebookresearch/faiss>`_ - K-Nearest Neighbours
* `gRPC <https://grpc.io/>`_ - Microservices
* `PyQt5 <https://pypi.org/project/PyQt5/>`_ - Signaling & Service

✍️ Authors :raw-html-m2r:`<a name = "authors"></a>`
--------------------------------------------------------------------------------

* Michael Behrisch - Idea & Initial work
* Alex Telea - Idea & Initial work
* Dong Nguyen - Idea & Initial work
* Simen van Herpt - Backend & Infrastructure
* Dennis Owolabi - Code management & Infrastructure
* Sinie van der Ben - Comparison & Faiss

See also the list of contributors who participated in this project.

🎉 Acknowledgements :raw-html-m2r:`<a name = "acknowledgement"></a>`
--------------------------------------------------------------------------------

* Hat tip to anyone whose code was used