Starting my thesis journey felt a lot like stepping into a massive, unfamiliar jungle — exciting but also terrifying.
I knew from the beginning that I wanted to do something with computer vision.
But computer vision is such a vast ocean: detection, segmentation, classification, enhancement — where do you even begin?
At first, I did what most people do: I read. I read papers like they were treasure maps, each promising a path toward some “undiscovered land” of innovation.
But the deeper I went, the more I realized: everything felt taken.
Every idea I thought about seemed to already have five papers, a GitHub repo, and a full-fledged competition around it.
There were moments when I genuinely wondered:
“Am I even capable of finding a gap?”
At one point, I even started doubting the process. Maybe “finding a gap” wasn’t about discovering an untouched continent — maybe it was about noticing a small crack in an existing structure.
And that’s when something clicked.
While exploring the area of semantic segmentation, especially for UAV (drone) imagery used in post-disaster scenarios, I noticed a consistent pattern:
Researchers focused heavily on improving accuracy, squeezing out that extra 1–2% on benchmark datasets.
But few people talked seriously about computational efficiency, especially for deployment on edge devices where power, memory, and speed matter.
It wasn’t a grand gap. It wasn’t a revolutionary field.
But it was real.
And sometimes, that’s more than enough.