A local, open-source video annotation tool written in Python and built on Meta’s Segment Anything Model 2. It lets users create high-quality segmentation masks across video frames with minimal interaction.
Hi, I’m Gergely Dinya
A computer scientist specializing in artificial intelligence, with an MSc and BSc from Eötvös Loránd University. My research interests include 3D computer vision, scene understanding, embodied AI, and image and video processing.
My work combines research and engineering in areas such as multi-camera tracking, semantic SLAM, interactive video segmentation, and annotation tools, with additional experience in game development and computer graphics.
Projects
A 2D puzzle game written in C++ using a custom game development framework.
A lightweight custom game development framework written in C++ with OpenGL-based rendering. The codebase is available on GitHub, with documentation and demo projects in progress.
SceneVGGT is a spatio-temporal 3D scene understanding framework that combines SLAM with semantic mapping. It supports online, near-real-time processing of streamed data with fixed VRAM usage regardless of input length, making it well suited for online tasks such as autonomous and assistive navigation.
Research and engineering work on multi-animal, multi-camera synchronization and instance tracking for avian behavior analysis of swimming birds in a complex floating aviary setup.
Publications
Multi-Camera Synchronization and Instance Tracking for Avian Behavior Analysis in a Floating Aviary
Gergely Dinya*, Anna Gelencsér-Horváth*, Andrea Ferretti, Niels C. Rattenborg, András Lőrincz. (*Equal-contribution first authors.)
Read paper
SceneVGGT: VGGT-based Online 3D Semantic SLAM for Indoor Scene Understanding and Navigation
Anna Gelencsér-Horváth*, Gergely Dinya*, Dorka Boglárka Erős, Péter Halász, Islam Muhammad Muqsit, Kristóf Karacs. (*Equal-contribution first authors.)
Read paper
SAMannot: A Memory-Efficient, Local, Open-source Framework for Interactive Video Instance Segmentation Based on SAM2
Gergely Dinya, András Gelencsér, Krisztina Kupán, Clemens Küpper, Kristóf Karacs, Anna Gelencsér-Horváth.
Read paper
Building Temporally Coherent 3D Maps with VGGT for Memory-efficient Semantic SLAM
Gergely Dinya, Péter Halász, András Lőrincz, Kristóf Karacs, Anna Gelencsér-Horváth.
Read paper
Automatic Camera Orientation Estimation for a Partially Calibrated Camera Above a Plane with a Line at Known Planar Distance (Supplementary Material)
Gergely Dinya and Anna Gelencsér-Horváth.
Read paper
Skills
Python, C/C++
English (C1), Hungarian (native)
Get in touch
I am open to research collaborations, technical projects and interesting software engineering opportunities.