Computer vision

Computer vision is a field of computer science that enables a computer to see, interpret and understand digital images in much the same way that human vision does. It aims to process images and detect useful information in order to produce an appropriate output. It is therefore closely linked with artificial intelligence, since it amounts to imparting human intelligence and instincts to a computer.


Applications of Computer vision

Smart Products

Computer vision and elderly homes

Elderly people tend to forget or refuse to wear devices that belong to an emergency system (e.g. a panic button). A vision-based approach does not require the elderly to wear any sensors and is able to detect falls automatically. Because falls are considered a major risk for the elderly, research has been done on automatic fall detection. Not only the fall itself but also its consequences pose a great risk. It has been shown that getting help quickly after a fall reduces the risk of death by over 80% and the risk of hospitalization by 26%. Furthermore, not only falls can be detected but also other events where help is needed (e.g. fire or flooding). Find more about it in this study.

Computer vision and babies

Nanit is a smart baby monitor for the nursery. Its machine learning and computer vision technology give you a full picture of your baby's sleep directly from the Nanit app. Morning, noon and night, you have access to easy-to-digest night and day sleep summaries, insights and trends. Nanit even gets smarter the more it analyzes your baby's sleep. It is like having a sleep expert on hand 24/7. With Nanit, you can hear, see, track and understand your child's sleep patterns. This means parents can see not only whether their baby is sleeping, but also how well their baby is developing.
The device also has a humidity and temperature sensor, which lets you monitor whether the room environment is ideal for the baby.

Computer vision and visually impaired persons

The goal of Range-IT is the development of a standalone wearable assistive device that will extend the mobility of visually impaired people. Besides increased mobility, Range-IT aims to improve the social participation of visually impaired persons and to decrease what may be described as 'unpleasant dependence'. The partners combine state-of-the-art technology. Soft Kinetic and TU Dresden combine their efforts to provide the newest 3D imaging techniques for real-time analysis of the environment. TNO and Elitac will build a tactile display that guides and alerts the wearer using small vibrating motors positioned around the body. UGent takes the lead in the design process, which is centred on the human end user. Dräger & Lienert, a company selling custom assistive solutions to blind people, provides the much-needed voice of the end user. With the Range-IT system, users will be able to navigate better inside unfamiliar buildings, such as train stations. It will also increase the range of obstacle detection from 1.5 m (white cane) to 7-8 m. This gives more mobility than a white cane at a significantly lower cost than a guide dog. Being able to recognise objects such as stairs or doors will further help visually impaired users move around in unknown buildings. Users will be able to carry or wear the system wherever they go.

Other Applications

Computer vision and cars

Cars can be equipped with a computer vision system that identifies and distinguishes objects on and around the road, such as traffic lights, pedestrians and traffic signs, and acts accordingly. Such a system could provide input to the driver or even stop the car if an obstacle suddenly appears on the road.

Computer vision and crowd dynamics

During an event, crowd managers observe the crowd and identify potentially dangerous locations. In public space design, pedestrian behavior is analyzed in order to support planning of urban infrastructure, for instance, in the design of pedestrian facilities. Recently, in visual surveillance, methods for observing crowded scenes and detecting dangerous situations are being developed for supporting security personnel. Knowledge about human crowd behavior is then used in intelligent environments in order to control and direct streams of pedestrians.

Computer vision and emotion detection

A computer can nowadays be trained to distinguish real from faked facial expressions of pain significantly better than humans can. Participants were shown video clips of the faces of people actually in pain (elicited by submerging their arms in icy water) and clips of people simulating pain (with their arms in warm water). For each clip, the participants had to indicate whether the expression of pain was genuine or faked. Whereas human observers could not discriminate real expressions of pain from faked ones better than chance, a computer vision system that automatically measured facial movements and performed pattern recognition on those movements attained 85% accuracy. Even when the human participants practiced, their accuracy only increased to 55%.


Challenges

In reality, computer vision is much more challenging than one might think. A human experiences vision in terms of colors, shadows, reflections, depth, shapes and so on, whereas a computer only experiences vision in terms of bits and bytes or, at a higher level, a grid of numbers. Moreover, a visual scene or object contains a lot of irrelevant or ambiguous data. For example, a face detector only cares about the region(s) of an image where a face is located. Processing all other regions would be irrelevant and thus highly inefficient. This is where feature detectors come into play.
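
To make the "grid of numbers" view concrete, here is a minimal sketch in Python with NumPy. The tiny 6x6 image and the bounding box are made-up values for illustration, not the output of any real face detector.

```python
import numpy as np

# A hypothetical 6x6 grayscale "image": each entry is a pixel intensity (0-255).
image = np.zeros((6, 6), dtype=np.uint8)
image[2:5, 1:4] = 200  # a bright 3x3 patch standing in for a face

# Hypothetical bounding box reported by a detector: (top, left, height, width).
top, left, height, width = 2, 1, 3, 3

# Cropping the region of interest is plain array slicing.
region = image[top:top + height, left:left + width]

# Share of the pixels that later stages actually need to look at.
fraction = region.size / image.size

print(image)      # the whole scene as the computer "sees" it
print(region)     # only the relevant part
print(fraction)   # 0.25
```

In this toy case only a quarter of the pixels are relevant, which is why restricting further processing to the detected region saves work.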


Preprocessing

Image preprocessing is an integral part of the computer vision pipeline. It entails cleaning up the image and making sure that it is ready to be fed into the image recognition pipeline. Several techniques are used in preprocessing, such as denoising, color enhancement, high dynamic range, artifact removal and image stabilization. Computer vision technologies such as object detection or deep learning algorithms often make headlines, whereas image preprocessing is largely ignored by the press. Preprocessing is nevertheless an equally important part of the overall process and deserves special attention, depending on the use case.
Some of the basic preprocessing techniques are noise removal (e.g. median filtering), edge detection, corner detection, sharpening of images (e.g. high-frequency emphasis) and adjusting the contrast (histogram equalization). You can read more about them in the book Feature Extraction & Image Processing (M. Nixon and A. Aguado).
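
Two of these techniques, median filtering and histogram equalization, can be sketched in a few lines of NumPy. This is a simplified illustration for small 8-bit grayscale arrays, not an optimized implementation; the function names and test images are our own.

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")  # repeat border pixels
    out = np.empty_like(img)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            out[r, c] = np.median(padded[r:r + size, c:c + size])
    return out

def equalize_histogram(img):
    """Spread the intensities of an 8-bit image over the full 0-255 range."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    if cdf[-1] == cdf_min:          # constant image: nothing to stretch
        return img.copy()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]                 # look up the new value of every pixel

# A flat image with one "salt" noise pixel in the middle.
noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255
clean = median_filter(noisy)

# A low-contrast image whose values sit in a narrow band.
low_contrast = np.array([[100, 110], [120, 130]], dtype=np.uint8)
stretched = equalize_histogram(low_contrast)

print(clean[2, 2])   # 100: the outlier is gone
print(stretched)     # [[  0  85] [170 255]]
```

The median filter removes the isolated outlier without blurring the rest of the image, and equalization maps the four closely spaced intensities onto the full range.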

Feature detectors

A feature is a piece of information which is relevant for solving the computational task related to a certain application. Features may be specific structures in the image such as points, edges, shapes or objects.
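
As an illustration of an edge feature, the sketch below computes the Sobel gradient magnitude with plain NumPy. The Sobel operator is one common choice among many edge detectors; the helper names and the synthetic test image are assumptions made for this example.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)  # responds to horizontal intensity change
SOBEL_Y = SOBEL_X.T                            # responds to vertical intensity change

def filter3x3(img, kernel):
    """Slide a 3x3 kernel over the image (valid region only, no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            out[r, c] = np.sum(img[r:r + 3, c:c + 3] * kernel)
    return out

def edge_strength(img):
    """Gradient magnitude: large where the intensity changes abruptly."""
    img = img.astype(float)
    gx = filter3x3(img, SOBEL_X)
    gy = filter3x3(img, SOBEL_Y)
    return np.hypot(gx, gy)

# Synthetic image: dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = edge_strength(image)
print(edges)  # every row is [0. 4. 4. 0.]
```

The response is zero in the flat areas and peaks at the columns that straddle the dark-to-bright boundary, which is exactly the structure an edge feature is meant to capture.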

Features can also change with lighting, pose, perspective and so on. It is therefore important to figure out which features are relevant and which are not. Irrelevant features are typically too common and thus carry little information. Information theory tells us that rare occurrences carry the most information, so the task of a computer vision system is to identify these infrequent, information-rich patterns, either through machine learning (feature learning) or human engineering (feature engineering). Put simply, features are the distinctive signatures of an image that might be useful for prediction.
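
The information-theoretic claim can be stated with the standard definition of self-information, where p(x) is the probability of observing x. The two probabilities in the worked example are made-up values chosen for easy arithmetic.

```latex
I(x) = -\log_2 p(x)
\qquad
p(x) = \tfrac{1}{2} \;\Rightarrow\; I(x) = 1 \text{ bit}
\qquad
p(x) = \tfrac{1}{1024} \;\Rightarrow\; I(x) = 10 \text{ bits}
```

A pattern that shows up in half of all images tells us very little, whereas one that appears in roughly one image in a thousand carries ten times as many bits.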

Feature engineering

Feature engineering is the human process of selecting features that are useful for a model's predictions. The quality and quantity of the features have a great influence on how good the model is. It is a process that can take some time and goes hand in hand with trial and error. The following steps are repeated until they yield the desired result:

  1. Brainstorming or testing features;
  2. Deciding what features to create (feature selection);
  3. Creating features;
  4. Checking how the features work with your model;
  5. Improving your features if needed;
  6. Going back to brainstorming/creating more features until the work is done.
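
The loop above can be sketched on a toy problem. Everything here is hypothetical: the 4x4 striped images, the three candidate features and the single-threshold rule that stands in for "your model" are invented for illustration only.

```python
import numpy as np

def make_image(vertical, brightness):
    """Toy 4x4 image with horizontal stripes, transposed for vertical ones."""
    img = np.zeros((4, 4))
    img[::2, :] = brightness
    return img.T if vertical else img

# Class 0 = horizontal stripes, class 1 = vertical stripes.
images = [make_image(False, 1.0), make_image(True, 0.5),
          make_image(False, 0.8), make_image(True, 1.0)]
labels = np.array([0, 1, 0, 1])

# Steps 1-3: brainstorm and create candidate hand-crafted features.
candidates = {
    "mean_intensity":    lambda img: img.mean(),
    "horizontal_change": lambda img: np.abs(np.diff(img, axis=1)).mean(),
    "vertical_change":   lambda img: np.abs(np.diff(img, axis=0)).mean(),
}

def best_threshold_accuracy(values, labels):
    """Step 4: accuracy of the best single-threshold rule on one feature."""
    best = 0.0
    for t in values:
        pred = (values >= t).astype(int)
        # The rule may point either way, so also score the flipped prediction.
        acc = max(np.mean(pred == labels), np.mean(pred != labels))
        best = max(best, acc)
    return best

scores = {}
for name, feature in candidates.items():
    values = np.array([feature(img) for img in images])
    scores[name] = best_threshold_accuracy(values, labels)

# Steps 5-6: keep what works, then revisit or replace the rest.
kept = [name for name, score in scores.items() if score == 1.0]
print(scores)
print(kept)  # ['horizontal_change', 'vertical_change']
```

Mean intensity depends on brightness rather than stripe direction, so it cannot separate the classes perfectly, while the two change-based features can. In a real project this check would be run against a proper model and held-out data.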

Traditional feature engineering often requires expensive human labor and relies on expert knowledge. In addition, hand-crafted features normally do not generalize well. This motivates the design of efficient techniques that automate and generalize the process: feature learning.

Feature learning

Feature learning is essentially the automated counterpart of feature engineering, carried out by machine learning. It uses a set of techniques that both construct features and learn from them. In a way, machine learning goes beyond the human ability to explore features: computers can run thousands of iterations in mere seconds and can therefore discover features that even experts might not detect (hidden features). Feature learning can be divided into two categories: supervised learning (supervised dictionary learning, convolutional neural networks) and unsupervised learning (K-means clustering, unsupervised dictionary learning, local linear embedding, principal component analysis and independent component analysis).
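
As a small example of unsupervised feature learning, the sketch below implements principal component analysis (PCA) with NumPy's singular value decomposition. The data set is a made-up set of five two-dimensional points; in a vision setting each row would typically be a flattened image or image patch.

```python
import numpy as np

def pca(data, n_components):
    """Learn the directions of largest variance from the data itself."""
    centered = data - data.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]             # learned feature directions
    projected = centered @ components.T        # data expressed in those features
    explained = singular_values**2 / np.sum(singular_values**2)
    return components, projected, explained[:n_components]

# Hypothetical data: two measurements that always vary together (y = 2x).
data = np.array([[0.0, 0.0],
                 [1.0, 2.0],
                 [2.0, 4.0],
                 [3.0, 6.0],
                 [4.0, 8.0]])

components, projected, explained = pca(data, n_components=1)

print(np.abs(components[0]))  # direction proportional to (1, 2), up to sign
print(explained[0])           # close to 1.0: one feature captures all variance
```

No one told the algorithm that the two measurements are redundant. It discovers a single direction that describes the data, which is the sense in which the feature is learned rather than engineered.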