The project explores OCR and visual language models (VLMs) in the context of complex graph comprehension. We are building a modular architecture that can swap models in and out and evaluate their performance on graphs of oil and gas wells.
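As a rough illustration of the swap-and-evaluate idea, here is a minimal Python sketch. Every name in it (`GraphReader`, `FixedAnswerReader`, `register`, `evaluate`) is hypothetical and not taken from the project's codebase, and exact-match accuracy is only an assumed stand-in for whatever metric the evaluation actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


class GraphReader(Protocol):
    """Hypothetical interface: any model that turns an image into text."""

    def read(self, image: bytes) -> str: ...


@dataclass
class FixedAnswerReader:
    """Stand-in 'model' that always returns the same answer (illustration only)."""

    answer: str

    def read(self, image: bytes) -> str:
        return self.answer


# Registry of available readers; swapping a model means choosing another key.
REGISTRY: Dict[str, GraphReader] = {}


def register(name: str, reader: GraphReader) -> None:
    """Make a reader available under a name."""
    REGISTRY[name] = reader


def evaluate(name: str, samples: List[Tuple[bytes, str]]) -> float:
    """Exact-match accuracy of the named reader on (image, expected_text) pairs."""
    if not samples:
        return 0.0
    reader = REGISTRY[name]
    correct = sum(reader.read(image).strip() == expected for image, expected in samples)
    return correct / len(samples)


# Usage: two interchangeable readers scored on the same (made-up) samples.
register("reader_a", FixedAnswerReader("1200 m"))
register("reader_b", FixedAnswerReader("950 m"))
samples = [(b"img1", "1200 m"), (b"img2", "1200 m"), (b"img3", "950 m")]
for model_name in REGISTRY:
    print(model_name, evaluate(model_name, samples))
```

Because both readers satisfy the same interface, the evaluation code never needs to know which model it is scoring.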
The project addresses the problem that multimodal and OCR models struggle to comprehend graph and tabular content. OCR models are good at reading text, even when it's blurry or grainy, but reading grainy and sometimes hand-drawn graphs is considerably harder.
Google's Gemini has become the de facto model for graph comprehension, but as many have noticed, it quickly gets pricey. We therefore believe that a combination of OCR models, visual language models, and careful scaling, resizing, and skewing of the input images could be a more efficient alternative.
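To make the preprocessing idea concrete, here is a minimal, library-free Python sketch of two of the steps mentioned above: resizing while keeping the aspect ratio, and a horizontal shear as a simple form of skewing. The function names and the tiny list-of-lists "image" are assumptions for illustration only; a real pipeline would more likely rely on an imaging library, and the project's actual preprocessing may differ.

```python
import math
from typing import List, Tuple

Coeffs = Tuple[float, float, float, float, float, float]


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals max_side, keeping aspect ratio."""
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def shear_coefficients(angle_degrees: float) -> Coeffs:
    """Coefficients (a, b, c, d, e, f) of a horizontal shear.

    Each output pixel (x, y) is sampled from input (a*x + b*y + c, d*x + e*y + f).
    """
    return (1.0, math.tan(math.radians(angle_degrees)), 0.0, 0.0, 1.0, 0.0)


def apply_affine(pixels: List[List[int]], coeffs: Coeffs, fill: int = 0) -> List[List[int]]:
    """Apply an affine map with nearest-neighbour sampling to a 2D grid of pixels."""
    height, width = len(pixels), len(pixels[0])
    a, b, c, d, e, f = coeffs
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            sx = round(a * x + b * y + c)
            sy = round(d * x + e * y + f)
            inside = 0 <= sx < width and 0 <= sy < height
            # Pixels sampled from outside the source are filled with a constant.
            row.append(pixels[sy][sx] if inside else fill)
        out.append(row)
    return out


# Usage: pick a target size for a large scan, then shear a tiny 3x3 test grid.
target_size = fit_within(4000, 3000, 1024)
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sheared = apply_affine(grid, shear_coefficients(45))
print(target_size, sheared)
```

Downscaling before inference is one plausible way to reduce per-image cost, and a shear with the opposite angle can straighten a slanted scan before it reaches an OCR model or VLM.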

Dashboard showing the modular design for model evaluation

GNOSIS architecture overview showing the modular design for model evaluation
Start Date: November 2025
Current Phase: In development
Upcoming Milestones
Wellvector
The goal is to deploy as a pay-per-use service with no profit, publish it as a Docker image, and separate the evaluation components into an eval suite for visual AI.
Niklavs Visockis, Tech Lead
Georg Zsolnai, Backend
Michael Yu, Backend
Sebastian Schmülling, Inference
Elias Lindstenz, Inference
Giulio Altomari, Fullstack