Taste and color: how to teach robots to feel like humans

Ever since robots were invented, humans have been trying to make them as similar to themselves as possible. Being human means perceiving the world around us in a particular way, yet the "software" behind that perception is hard to reverse-engineer. Imitating human perception with electronics has therefore become one of the most challenging tasks. Today it is also a major stumbling block in creating artificial intelligence systems and developing human-machine interfaces, which demand ever more versatility from robots.

Sensing robots - second-generation robots

The first generations of industrial robots had no technical means of receiving information about their environment. They lacked both vision and intelligence, so any object or person in their path risked a collision or a blow from a pre-programmed motion, and the machine itself could be damaged in such an interaction.

First-generation robots cannot operate with unknown, arbitrarily located, non-oriented objects and require additional devices to organize the working area in a special way. All of this makes automation more complicated, more expensive, and less flexible. These unpleasant consequences of the sensory and intellectual limitations of first-generation robots can be avoided by significantly expanding the range of information-measuring sensors and control programs. This is how second-generation robots - sensing robots - emerged.

These robots differ from the first generation by a far broader assortment of artificial sense organs: above all tactile, visual, and sound sensors, among others. In second-generation robots, the sense organs feed information about the robot and its environment into the control system, which is no longer limited to a memory and programming device, as in first-generation robots, but requires a controlling computer. It is precisely this sensing, combined with sufficiently mature and diverse software on the controlling computer, that allows second-generation robots to work with non-oriented objects of arbitrary shape, assemble and mount structures according to a drawing, interact with the external environment, and perform the required (programmed) sequence of operations in a changing environment. Sensing is thus a necessary prerequisite for increasing robots' functionality.

The information-measuring system of sensing robots, i.e., their system of sense organs, consists of sensors of external and internal information. The balance between these sensors and their interaction differ significantly from program-controlled robots. It is important to note that in sensing robots a significant role is played by external information sensors, which perceive, analyze, recognize, and monitor environmental conditions. Sensors of the kind used in first-generation robots can serve as internal information sensors.

Requirements and main characteristics of external information sensors

Depending on the purpose of a sensing robot, its external information sensors must simulate touch, vision, hearing, etc. In addition, there are sensors for measuring radioactivity, pressure, humidity, temperature, and other physical quantities. These sensors must combine high accuracy, reliability, and speed with small size, low weight, and low cost.

It's worth noting that sensors and gadgets many times more sensitive than our own senses already exist. A photocell sees much of the spectrum better than the eye, and a microphone hears better than the human ear. A seismograph is more sensitive than our sense of touch, and our feeling of temperature is, of course, no match for a thermometer.

Only one sense, the sense of smell - the detection and identification of tiny amounts of organic impurities - remains better in humans and animals than in existing instruments. In this regard, it's worth emphasizing that the olfactory organs are among the most complicated senses, and the nature of the phenomenon on which they operate is still not fully understood. "Catching up with the dog's nose" is therefore one of the current problems of robot sensing.

Experience in studying human and animal sense organs offers much that can serve as a prerequisite for developing artificial ones. In living systems, every sense organ is equipped with its own organs of motion, which in turn are richly supplied with kinesthetic receptors. During perception, an essential role belongs both to individual receptors and to receptive fields and local detectors, which make it possible to distinguish elementary features of objects. When analyzing the environment and the internal state, coordinated joint processing of different types of sensory signals, taking the actions being performed into account, plays an important role.

Human interaction with the external environment is primarily based on visual, audio, and tactile-kinesthetic information processing. There are also situations when only tactile and kinesthetic sensations can give correct information about the characteristics of the environment. These situations arise, for example, when micro-movements of the fingers are needed to determine the shape and surface quality of surrounding objects, or when obstacles prevent visual control.

The main types of "artificial senses" – sensors

Tactile and kinesthetic sensors

Sensors with tactile and kinesthetic sensitivity were created to solve several challenges involved in searching for objects, grasping them, and moving them. The most basic of these is the contact sensor: a tiny switch that detects when an object touches it.

Tactile sensors respond to touch and detect pressure where the sensor contacts an object. They're frequently attached to the bumpers of transport robots or the grippers of manipulation robots. These sensors are used to detect individual objects, prevent damage to them and to the robot itself, and recognize the external environment by touching and groping.
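As a sketch of how the simplest contact sensor might be read, the snippet below (an illustration, not any specific robot's firmware) filters out mechanical switch "bounce" by requiring several consecutive closed readings before reporting contact:

```python
def stable_contact(readings, window=3):
    """Report contact only after `window` consecutive closed (True)
    samples, so a single bounced or noisy reading does not trigger
    an emergency stop."""
    run = 0
    for closed in readings:
        run = run + 1 if closed else 0
        if run >= window:
            return True
    return False
```

In practice the same idea is usually implemented in hardware or in an interrupt handler, but the principle - never act on a single raw sample - is the same.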

Kinaesthetic sensors record the position and movement of actuators (e.g., the fingers of a manipulator's grip) and the forces in them.
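What kinesthetic readings buy the control system can be sketched with a textbook two-link planar arm (a toy model, not any particular robot): joint-angle sensors alone are enough to compute where the fingertip is.

```python
import math

def fingertip_position(theta1, theta2, l1=1.0, l2=1.0):
    """Forward kinematics of a two-link planar arm: given joint
    angles (radians) reported by kinesthetic sensors and the link
    lengths, return the fingertip (x, y) in the base frame."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

With both joints at zero the arm is stretched out along the x-axis; every combination of angle readings maps to exactly one fingertip position.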

An essential feature of tactile and kinesthetic sensors is that they work in almost any environment. In particular, they are indispensable for underwater robots, because the television or optical feedback channel stops working when the water is turbid.

Visual sensors

For perception and analysis of three-dimensional scenes, special equipment is needed that must, in essence, imitate the functioning of the eye. It must solve problems such as actively searching for objects by changing the orientation of the visual sensor, automatically focusing the image, measuring the distance to objects, adjusting sensitivity to changing lighting conditions, and extracting image features (color, texture, contours, size, shape, etc.).

In visual robot sensing, television and optical sensors are the sources of information. A television sensor ("television eye") is a television camera: the whole image or a fragment of it is stored in memory as a two-dimensional matrix projection of a real three-dimensional scene. The television image is flat, however, unlike the three-dimensional objects themselves, which deprives both human and robot of depth perception and the associated "presence effect." Holography is therefore of great importance for robot perception, since it records and reconstructs not a two-dimensional projection but the light wave emanating from an object, with all its detail. To determine the color of objects, robots use photocells, photodiodes, light filters, light guides, and other elements together with light sources.

The main disadvantage of visual sensors is that they are useless in the absence of light sources or under strong light scattering or absorption, such as underwater or in outer space.

Sound sensors

Sound sensors include all kinds of microphones and ultrasonic sensors. Microphones pick up voice commands when a robot is controlled by speech. Ultrasonic sensors consist of a signal transmitter and a receiver; the ultrasonic signal reflected off objects lets them detect those objects and determine their distance.

Compared with optical sensors, ultrasonic sensors have several advantages: they can detect transparent objects; their readings do not depend on lighting conditions and are largely insensitive to changes in the physical properties of the medium (dust, steam, liquid); and the service life of the oscillators is practically unlimited. However, because ultrasonic vibrations are poorly directed, the accuracy of determining distances to objects is low. In addition, such sensors cannot detect very small objects because of the relatively long wavelength of ultrasound.
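The distance calculation behind such a sensor is simple enough to sketch (illustrative values, not any particular sensor's datasheet): the pulse travels to the object and back, so the one-way distance is half the round trip, and the speed of sound in air rises with temperature, roughly c = 331.3 + 0.606 * T m/s.

```python
def echo_distance_m(round_trip_s, temp_c=20.0):
    """Distance to an object from an ultrasonic echo: the pulse
    covers the distance twice, so d = c * t / 2, where the speed
    of sound c depends on air temperature T in degrees Celsius."""
    c = 331.3 + 0.606 * temp_c  # m/s, linear approximation
    return c * round_trip_s / 2.0
```

A 10 ms echo at room temperature thus corresponds to an object roughly 1.7 m away; ignoring the temperature term would bias every reading by a few percent.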

How human senses differ from the sensory systems of modern robots

In a sense, robots are trying to copy the senses of humans or animals, but they often have far more advanced systems. For example, the human vestibular apparatus registers changes in body or head position, yet no organ can tell us to how many angular minutes our knee or elbow is bent, or how far an object is from us to the nearest micron. Humans don't need that, but in the current paradigm of robotics development, such information can help intelligent machines.

Robots can be divided into two types: locomotion and manipulation. The main task of the former is to move a payload or a person over significant distances, as drones, unmanned cars, and boats do. Here the primary job of sensors is to determine the robot's position in space and its placement relative to nearby objects. To these we can add linear and angular acceleration sensors, which provide a sense of balance, i.e., orientation in the gravitational field.

The task of manipulation robots, which functionally mimic hands, is to perform various operations with objects. Here kinesthetic sensing comes to the fore, providing a sense of position, movement, and force. In other words, we need sensors that can determine the current configuration and speeds of individual parts of the robot, as well as tactile and pressure sensors. The latter are especially in demand to ensure reliable grasping of manipulated objects and to control the forces of interaction with things, the environment, and humans - for example, to perform a contact operation cleanly without damaging the robot or injuring a person who is nearby or in direct contact.
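A minimal sketch of what such pressure feedback is for (hypothetical names and gain, not a real gripper's controller): each control cycle nudges the motor command proportionally toward a target grip force, so the grasp stays firm without crushing the object.

```python
def grip_step(measured_n, target_n, command, gain=0.05):
    """One proportional-control step for grip force: raise the
    motor command when the measured force (newtons) is below the
    target, lower it when above, and clamp the command to [0, 1]."""
    command += gain * (target_n - measured_n)
    return min(max(command, 0.0), 1.0)
```

Run in a loop against the pressure sensor, the command settles where the measured force matches the target; real controllers add integral and damping terms, but the feedback idea is the same.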

Of course, all the robots listed above can use a considerable number of auxiliary "service" sensors, depending on the specific application. Some provide information about the system's internal state, others about the environment. Sensors for human-robot interaction are especially crucial in this context.

Trends in Robot Sensorics

One of the current trends in robot sensing is the development of tactile sensors. Progress is moving towards robots that can work efficiently and safely in a dynamic, unstructured environment - one where nothing can be fixed strictly in place once and for all - and in close contact with humans.

As a result, ever more new types of sensors are emerging. For example, distributed sensors called artificial skin are being developed and tested. The field of haptics - force feedback control - is growing rapidly in the USA, Switzerland, Germany, Korea, China, and Japan, where large laboratories work in this area.

The concept of integrated functional design, or co-design, of all robot components is also evolving in robotic systems: the mechanics, sensors, power supplies, computing platforms, algorithms, and software are developed simultaneously, starting from the intended functionality of the system as a whole.

With the development of personal robotics, sensors for multimodal human interaction should become more widely used - for example, combined sensors that simultaneously capture audio and visual information for subsequent natural language processing.

Sensing robots: what a machine with sight and hearing can do

Some intelligent machines have gas and liquid analyzers as analogs of our organs of smell and taste, with which they can detect odors and tastes, even if not to the same extent as we can. For decades, such "electronic noses" have helped customs officers detect dangerous and prohibited substances. Laboratories use them to analyze foodstuffs for freshness and undeclared impurities. They help doctors diagnose the early stages of gastritis and detect explosive gas leaks in pipelines.

As for hearing, voice assistants such as Siri on iOS are examples of "electronic ears" that perceive your speech and "understand" what you said. Recognizing natural speech is a challenging task, so creating such systems draws on knowledge from different fields of linguistics, mathematics, and computer science.

As an analog of human vision (two eyes plus the biomechanisms our brain forms from birth), we can cite an entire branch of computer science: computer (or machine) vision. Computer vision systems tackle every task associated with visible reality and sometimes go beyond it, as with vision in the infrared range, electron microscopy, and radiography. Such systems recognize handwritten input and authorize access to protected systems by retina or iris. They spot subtle signs of abnormalities on radiographs and provide data entry via eye-movement tracking for people with disabilities.

Suppose there are several images of an object taken with different parameters - the angle of rotation, the lighting, or something else. The task of an AI system is then to reconstruct a three-dimensional scene, a model of the photographed object. This is a task we perform every day: we look somewhere, estimate the distance to things and their shape, and notice protruding parts. Our brain does it in the background.

Modern computer vision algorithms are fast enough to obtain three-dimensional data from images, even from just two cameras, in fractions of a second. In this case we can speak of real-time stereo imaging algorithms. A robot equipped with such a camera or cameras can navigate an unknown space in real time, build a model of that space, and send it to a human.
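At the core of stereo reconstruction sits one formula: for a rectified, calibrated camera pair, a point whose two images differ horizontally by a disparity of d pixels lies at depth Z = f * B / d, where f is the focal length in pixels and B the baseline between the cameras. A sketch with illustrative values:

```python
def stereo_depth_m(focal_px, baseline_m, disparity_px):
    """Depth of a matched point from a rectified stereo pair
    (pinhole camera model): Z = f * B / d. Larger disparity means
    a closer point; zero disparity means the point is at infinity."""
    if disparity_px <= 0:
        raise ValueError("non-positive disparity: no finite depth")
    return focal_px * baseline_m / disparity_px
```

Repeating this for every matched pixel yields a depth map, which is exactly the "model of the space" a stereo-equipped robot builds.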

The practical application of such systems seems obvious. Suppose this robot is a bathyscaphe, a quadcopter, a wheeled platform, or a walking apparatus. In that case, it can be sent to a hard-to-reach place, such as a radiation-contaminated area, even the Moon or the ocean floor, and it will build a model of the environment and find its way without human intervention.

Doctors also use computer eyes to analyze medical images, entomologists to identify different types of arthropods, and even chemists use computer vision to record the results of experiments.

For example, dentists can use a special camera to photograph a chipped tooth from different sides and build a three-dimensional model of it. The patient can then see a model of the damaged tooth and a model of the intact tooth on the monitor. The program aligns them, a special algorithm subtracts the volumes, and a three-dimensional model of the missing fragment is produced for fabrication. The model is exported as a file, a machine manufactures the missing part of the tooth, and it fits perfectly into its intended location.

There is also a diagnostic application. If a doctor has MRI slices of the brain or other organs, a three-dimensional picture can be built from them, making it clear how large a particular process in the patient's organs is and how far it has spread.
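How slice data turn into a volume estimate can be sketched with plain voxel counting (an illustration; real pipelines segment the images first): each slice contributes a binary mask of the region of interest, and the volume is the number of marked voxels times the voxel size.

```python
def lesion_volume_ml(masks, pixel_mm=1.0, slice_mm=5.0):
    """Volume of a region segmented on MRI slices: `masks` is a
    list of 2D 0/1 masks, one per slice; each marked pixel is a
    voxel of pixel_mm * pixel_mm * slice_mm cubic millimeters,
    and 1 ml = 1000 mm^3."""
    voxels = sum(sum(row) for mask in masks for row in mask)
    return voxels * pixel_mm * pixel_mm * slice_mm / 1000.0
```

Stacking the masks at their slice positions also gives the three-dimensional picture itself; the volume figure is what tells the doctor how widespread the process is.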

At the moment, three-dimensional reconstruction technologies are often used for navigation and cartography. Autonomous vehicles, including cars, are equipped with many sensors: radars, lidars, sonars, and cameras. The task of all these devices is to build a complete picture of the car's surroundings. With their help, the vehicle can park itself and follow lane markings automatically. These technologies are already deployed in assisted and fully autonomous driving (in driverless cars). Higher up in the air, computer vision systems solve aerospace mapping problems: using images in the visible, infrared, and ultraviolet ranges, they capture the topography of the earth over which satellites fly.

High-precision non-destructive scanning has found applications in cinematography, computer animation, and medicine. Creating models of highly realistic objects - people's faces, animals, etc. - is almost impossible without fast reconstruction technologies. Tools for high-precision transfer of real things into virtual scenes are based on structured-light algorithms.

Some companies specialize in technologies for cinematography and animation: their solutions use a projector and a high-resolution camera to scan human faces, and the resulting models make it possible to create realistic animated characters and to replace real actors on screen. Other companies create interactive holograms of macroscopic objects (neighborhoods, power plants, business centers); they, too, resort to computer vision to capture the dimensions of existing things.


Thus, robots today are more "human" than ever before. Some predict that machines will very soon be able to sense everything around them, but they still have a long way to go. Progress here should improve their "behavioral autonomy," making it easier for robots to interact with humans and to solve the everyday tasks and conflicts that arise along the way.