Hi, I’m Gergely Dinya

I am a computer scientist specializing in artificial intelligence, with an MSc and BSc from Eötvös Loránd University. My research interests include 3D computer vision, scene understanding, embodied AI, and image and video processing.

My work combines research and engineering in areas such as multi-camera tracking, semantic SLAM, interactive video segmentation, and annotation tools, with additional experience in game development and computer graphics.

Projects

Titan Game Development Framework

A lightweight custom game development framework written in C++ with OpenGL-based rendering. The codebase is available on GitHub, with documentation and demo projects in progress.

C++, OpenGL, GLSL, Game Development

SceneVGGT

SceneVGGT is a spatio-temporal 3D scene understanding framework that combines SLAM with semantic mapping. It supports online, near-real-time processing of streamed data with fixed VRAM usage regardless of input length, making it well suited for online tasks such as autonomous and assistive navigation.

3D Computer Vision, Semantic SLAM, Assistive Navigation

AviTrack

Research and engineering work on multi-animal, multi-camera synchronization and instance tracking for avian behavior analysis of swimming birds in a complex floating aviary setup.

Computer Vision, Tracking, Behavior Analysis

Publications

2025 Ecological Informatics

Multi-Camera Synchronization and Instance Tracking for Avian Behavior Analysis in a Floating Aviary

Gergely Dinya*, Anna Gelencsér-Horváth*, Andrea Ferretti, Niels C. Rattenborg, András Lőrincz. * Equal-contribution first authors.

2026 Accepted for ICIP 2026

SceneVGGT: VGGT-based Online 3D Semantic SLAM for Indoor Scene Understanding and Navigation

Anna Gelencsér-Horváth*, Gergely Dinya*, Dorka Boglárka Erős, Péter Halász, Islam Muhammad Muqsit, Kristóf Karacs. * Equal-contribution first authors.

2026 Journal of Open Research Software

SAMannot: A Memory-Efficient, Local, Open-source Framework for Interactive Video Instance Segmentation Based on SAM2

Gergely Dinya, András Gelencsér, Krisztina Kupán, Clemens Küpper, Kristóf Karacs, Anna Gelencsér-Horváth.

2025 Preprint

Building Temporally Coherent 3D Maps with VGGT for Memory-efficient Semantic SLAM

Gergely Dinya, Péter Halász, András Lőrincz, Kristóf Karacs, Anna Gelencsér-Horváth.

2025 Preprint

Automatic Camera Orientation Estimation for a Partially Calibrated Camera Above a Plane with a Line at Known Planar Distance (Supplementary Material)

Gergely Dinya and Anna Gelencsér-Horváth.


Skills

Programming

Python, C/C++

Research areas

3D computer vision, Scene understanding, Semantic SLAM, Embodied AI, Image and video processing, Video instance segmentation, Multi-camera tracking

ML / CV

PyTorch, OpenCV, NumPy, SAM2, VGGT

Computer Graphics

OpenGL, GLSL, Steam API

Tools

Git, LaTeX, ROS

Languages

English (C1), Hungarian (native)

Get in touch

I am open to research collaborations, technical projects, and interesting software engineering opportunities.