
How Do Robots See?

Credits: University of Maryland

In 2018, technology is evolving all around us. From the smartphone to artificial intelligence, we as a species have become extremely reliant on it. A significant, albeit still nascent, development in this tech-driven world is robotics. The world is looking to robots for solutions, and there is a buzz in the air. A sizable robot revolution is said to be underway.

This revolution is expected to overhaul the technology sector and the wider economy over the next two decades. And it will not be confined to a niche part of our environment, such as factories or labs. Robots will be omnipresent – from taking orders at your nearest McDonald’s to replacing your stylist for your haircut.

All these robots will differ widely in look, feel, and function, but certain capabilities have to be present in every one of them, namely analyzing a situation and acting on it using built-in software.

There are a lot of things we as humans take for granted without realizing it. When performing a task, we interpret it, weigh the possible outcomes, decide which variables work best together, and then take action.

But before all of this, there is something essential we do that comes effortlessly – we look. We look at the situation, the problem, or our surroundings using light reflected from a source, and then decide how to tackle it. These visual cues give us the ability to tangibly assess a situation.

So, going back to our earlier point, robots are meant to analyze situations and solve problems. But while doing so, how do they see? More importantly, how do they see well enough to understand the situation? Let’s take a deeper dive.

First things first – robots today cannot see in the conventional sense. They do not perform the act the way we do, and the technology that would let them do so is still far in the future. So what do they actually do when they are seeing?

The hardware part is easy. Robots are fitted with cameras. Robotic cameras are a huge area of development in their own right, but for now we can think of one as a phone camera for ease of understanding. These cameras capture light from the surrounding area and take images of the things around them. The images can be single snapshots or a sequence of frames that together form a moving image, that is, a video.


The tricky part comes when the robot has to decide what to do with the image once it has been taken.


Since the 1970s, robotics engineers have broken an image down to what it is in essence – a collection of pixels. Each pixel has a color expressed as red, green, and blue (RGB) values, and it combines with other pixels to form something useful. Programmers take the data from each pixel, note its color, contrast, shadow, and highlight, and then code a suitable course of action into the robot.
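To make the idea of an image as pixel data concrete, here is a minimal sketch in Python. The tiny 2x3 image, the function name, and the choice of the Rec. 601 luma weights to turn RGB into a single brightness value are all assumptions made for illustration, not something taken from a particular robot.

```python
# A tiny 2x3 "image": each pixel is a (red, green, blue) tuple, 0-255 per channel.
image = [
    [(255, 255, 255), (200, 200, 200), (30, 30, 30)],
    [(250, 240, 230), (10, 10, 10), (0, 0, 0)],
]


def brightness(pixel):
    """Approximate perceived brightness using the Rec. 601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


# One brightness number per pixel, rounded to the nearest integer.
levels = [[round(brightness(p)) for p in row] for row in image]

# A crude contrast measure: the gap between the brightest and darkest pixel.
flat = [value for row in levels for value in row]
contrast = max(flat) - min(flat)

print(levels)
print("contrast:", contrast)
```

Even this toy image yields six numbers before any decision is made, which hints at why working pixel by pixel gets expensive quickly.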

However, there was a problem. Pixels provide far too much data for a small, simple task, and a few decades ago storing this data efficiently was also an issue. To tackle this, engineers started coding algorithms that detect universal features like lines, corners, and textures. Images that share the same features fall into patterns, and similar actions can be taken for every image that follows the same pattern. This reduces the data from millions of pixels to a few hundred features.
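As a rough illustration of how a feature can stand in for many pixels, the sketch below finds points where brightness jumps sharply between horizontal neighbors, which is one simple way to locate a vertical line. The sample grid, the function name, and the `min_jump` threshold are assumptions chosen for the example; real feature detectors are considerably more sophisticated.

```python
# Grayscale image (0 = black, 255 = white): dark left half, bright right half.
gray = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]


def vertical_edges(img, min_jump=50):
    """Return (row, col) positions where brightness jumps to the next column."""
    edges = []
    for r, row in enumerate(img):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) >= min_jump:
                edges.append((r, c))
    return edges


edges = vertical_edges(gray)
pixel_count = sum(len(row) for row in gray)
print(f"{pixel_count} pixels reduced to {len(edges)} edge points: {edges}")
```

Here 24 pixel values collapse into 4 edge points that all sit in the same column, so a later step only has to reason about "one vertical line" rather than every pixel.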


Like a human, a robot doesn’t need to see everything in its surroundings. It needs just enough to perform its function. For example, a floor-cleaning robot like the Roomba only needs to tell unobstructed open floor from everything else and move on. A shadow and highlight algorithm can help it perform this function seamlessly. It doesn’t need to know what a chair or a table is, or any details of the furniture or object.
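A shadow and highlight check of this kind could look roughly like the sketch below. It is purely illustrative and not a description of how any actual Roomba works: the function name, the brightness thresholds, and the sample readings are all hypothetical.

```python
def next_move(strip, dark_below=60, bright_above=230, max_suspect=0.25):
    """Choose 'forward' or 'turn' from brightness values (0-255) of the floor ahead.

    Very dark values are treated as shadows and very bright ones as highlights.
    If too large a share of the strip looks like either, assume something is in
    the way. All thresholds are illustrative assumptions.
    """
    if not strip:
        return "turn"  # no data: play it safe
    suspect = sum(1 for v in strip if v < dark_below or v > bright_above)
    return "turn" if suspect / len(strip) > max_suspect else "forward"


open_floor = [120, 125, 118, 130, 122, 127, 119, 124]
chair_leg = [120, 40, 35, 30, 245, 127, 119, 124]

print(next_move(open_floor))  # forward
print(next_move(chair_leg))   # turn
```

Note that the function never identifies the chair leg as a chair leg. It only decides that the floor ahead does not look open, which is all the task requires.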

However, these same strategies cannot be applied to every kind of function, such as driverless cars or dog-walking robots. Those need more refined technology and strategies, which can come as add-ons to the base technology of a camera.


Robots with cameras and pixel- or feature-based algorithms are the foundational level. Programming them is manual and labor-intensive, and it essentially amounts to spoon-feeding the robot. Over time, this approach stops being feasible. This is where our earlier examples, like driverless cars, come in.

These advanced robots use better vision systems, developed by scientists over the years, that do not rely on programming the robot to interpret each object. Instead, the scientists are teaching robots how to see, and not only to see but also to adapt their behavior and learn on their own along the way. Thus a robot is not just a chassis for a camera but a piece of self-evolving software.

One of the ways they do this is by creating a software-based neural network. Put simply, the engineers give the robot data and the structure of the task but do not hand-code the algorithm. That part is left to the robot’s own capacity to learn.
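The smallest possible version of this idea is a single artificial neuron, a perceptron, that adjusts its own weights from labeled examples. The sketch below is a toy under stated assumptions: the two input features (how dark a patch is and how many edges it contains), the training examples, and the function names are invented for illustration, and real vision systems use networks with vastly more neurons.

```python
def predict(weights, bias, features):
    """Fire (1) if the weighted sum of the inputs is positive, else stay quiet (0)."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0


def train(examples, epochs=20, lr=0.1):
    """Perceptron learning rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


# Hypothetical training data: (darkness, edge density) -> 1 = obstacle, 0 = open floor.
examples = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.1, 0.2), 0),
    ((0.2, 0.1), 0),
]

weights, bias = train(examples)
print(predict(weights, bias, (0.85, 0.7)))   # a patch it has never seen: 1
print(predict(weights, bias, (0.15, 0.15)))  # another unseen patch: 0
```

Nobody wrote a rule such as "dark and busy means obstacle" here. The engineers supplied examples and a structure, and the weights that encode the rule came out of the training loop.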

By performing its tasks, the robot learns more and more and soon becomes more capable than a scientist could make it by hand. For example, there are robots that are learning how to cook by watching cooking videos on YouTube. This also lets them evolve recipes. The robot was told what cooking is, what the vessels are, and which techniques exist, but how to coordinate all of these into a well-executed task is something it learned on its own.

However, these types of robots are still in their early stages. At present they are experimental, but they will soon be a reality.


There are systems called hive minds where the neural networks of multiple robots are connected. Thus, if a new robot joins in, it doesn’t need to learn from scratch. It can just build on the information the robots before it have gathered.
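The bookkeeping behind such a hive mind can be sketched in a few lines. Everything here is hypothetical: the `HiveMind` and `Robot` classes, the skill names, and the idea of storing a learned model as a tuple of numbers are assumptions made to illustrate the sharing, not a real system's design.

```python
class HiveMind:
    """Hypothetical shared store of what the connected robots have learned."""

    def __init__(self):
        self.skills = {}

    def share(self, skill, model):
        self.skills[skill] = model


class Robot:
    def __init__(self, name, hive):
        self.name = name
        self.hive = hive
        # A newcomer starts with a copy of everything the hive already knows.
        self.skills = dict(hive.skills)

    def learn(self, skill, model):
        """Learn something locally and publish it for the other robots."""
        self.skills[skill] = model
        self.hive.share(skill, model)

    def sync(self):
        """Pull in whatever other robots have shared since the last sync."""
        self.skills.update(self.hive.skills)


hive = HiveMind()
first = Robot("first", hive)
first.learn("spot_obstacle", (0.14, 0.11, -0.1))  # e.g. learned weights and bias

second = Robot("second", hive)       # joins later, inherits the skill at once
second.learn("find_dock", (0.3, 0.7, -0.2))
first.sync()                         # the older robot catches up too

print(sorted(second.skills))
print(sorted(first.skills))
```

The second robot never trained on obstacles, yet it starts out with that skill, and its own discovery flows back to the first robot on the next sync.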

The future of robotics is bright, and the field has never evolved faster. Soon, most systems will be automated or robotized. Whether this is a bane or a boon for humans is something only the future can show. By now, we’re in too deep to turn back, and maybe robotics is indeed a change for the better.


