Archive for Artificial Intelligence

Reviewing a book about ROS programming

// November 7th, 2013 // 1 Comment » // Artificial Intelligence

In the modern robotics community, there has been a need for a book that explains the ins and outs of the Robot Operating System (ROS).

ROS is the most popular robotics framework in the world. Created by Willow Garage, it provides easy communication between processes, standardized access to robot resources, and clear visualization of robot data. Until now, your only source of material for learning ROS was the excellent, but sometimes confusing, documentation provided by Willow Garage.

Now, Enrique Fernández and Aaron Martinez have filled the gap with their book Learning ROS for Robotics Programming.


The book covers all the basic aspects required to understand ROS, how to install it, and how to use it with your own robot.

The first chapter describes how to install two different versions of ROS (Electric and Fuerte), including how to set up a virtual machine to work with ROS.
The second chapter explains the core concepts of ROS: topics, nodes, stacks, packages, services, etc. They are not intuitive at first, but the book provides a clear explanation.
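The topic concept at the heart of those chapters boils down to named publish/subscribe channels between nodes. A minimal sketch of the idea in plain Python (an illustration of the concept only; the class and method names are mine, not the rospy API):

```python
# Conceptual sketch of ROS-style topics: nodes communicate by publishing
# messages on named channels, and every subscriber to a channel receives
# every message published on it.

class Topic:
    """A named channel connecting publisher nodes to subscriber nodes."""
    def __init__(self, name):
        self.name = name
        self.subscribers = []          # callbacks registered by subscriber nodes

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)

received = []
chatter = Topic("/chatter")            # ROS topic names look like file paths
chatter.subscribe(received.append)     # a "listener" node registers its callback
chatter.publish("hello world")         # a "talker" node publishes a message
print(received)                        # -> ['hello world']
```

In real ROS the same pattern is distributed across processes and machines, which is what makes the framework so convenient.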

The next chapters cover debugging with ROS, how to use sensors and actuators, and how to simulate your own robot with Gazebo, the default simulator for ROS.

Then several chapters are dedicated to bringing some intelligence to your robot using out-of-the-box solutions included in ROS: how to make a robot navigate, and how to make a robot use visual information to define its behaviour.

The book ends with a chapter dedicated to practicing everything learned on actual complex robots (even if only in simulation). It uses the freely available simulations of well-known robots to show how ROS has been implemented on them, and how you can use ROS programming to make those robots perform some useful activity, all of it in the Gazebo simulation environment.


The book is very well structured. It builds from the simplest topics to the more complex ones with a smooth progression. It is full of exercises that guide you step by step through all the concepts. You can definitely use the book to learn ROS.

There are three key parts of ROS that are described particularly well in the book:

  • The content of chapter 3, which describes how to debug ROS code. This is one of the more difficult subjects when learning ROS, maybe because it is not well described in the official documentation. The book, however, manages to teach it clearly.
  • The actuators and sensors in common use in robotics are described in great detail.
  • The creation of a simulation for your own robot, another of the confusing subjects, is treated very well.

I can point out two drawbacks:

  • The book focuses on the ROS Fuerte version, which is a little outdated (three more versions have appeared since Fuerte). However, this point is of low importance because the core ROS concepts do not change from version to version; it mainly affects the last chapter of the book.
  • The book is not deep enough to be used as a reference book. However, I believe that was not the authors' goal, which was rather to teach newcomers. If that was the case, the goal is perfectly achieved.

My recommendation: if you are new to the world of ROS, this book is a must. It will definitely speed up your learning of the subject.

    The essence of A.I.

    // January 12th, 2013 // No Comments » // Artificial Intelligence

    The question was posed by Menno Mafait in the LinkedIn group Applied Artificial Intelligence: what is the essence of artificial intelligence?


    Answers to that question mostly addressed which A.I. technique would be most suitable for generating an intelligent machine. The techniques discussed ranged from logic-based approaches to genetic algorithms, passing through artificial neural networks, Bayesian networks, probabilistic approaches, etc. The best technique would then be considered the essence of A.I.

    My answer to the question was the following:

    Of course, after having provided such an answer, the next two logical questions were:
    1. How do you define understanding?
    2. How can you build a machine that understands?

    I don't know how to define understanding.
    I do know, though, when somebody (or something) understands and when it does not. It is true that understanding can only be observed from the outside of the person/thing, but this does not imply that if I build a system that looks from the outside as if it understands, then it has actually understood. Or, even worse, that real understanding is unnecessary because the system without understanding presents the same external behaviour as the system with it.


    I really do believe that understanding is an evolutionary advantage for creatures with limited resources. When a system that has to behave in the world has a (very) limited amount of computational resources (like ourselves), understanding makes the system more efficient and robust; it works better with fewer computational resources.
    For a system with unlimited resources, understanding may not be necessary, since it can compute/observe/store the solution for EVERY state in the Universe. But understanding releases us from needing such a big storage system. It helps us compress reality into chunks that we can use to solve the problems of life.

    And that is exactly the role of science, to make us better understand the world to be able to live better in it.


    Recently, there has been a long discussion between two great scientists, Noam Chomsky and Peter Norvig. Current A.I. is dominated by statistical analysis of data in order to find patterns and build large tables that contain the answer to each possible situation in a narrow domain. Those techniques are extremely efficient and work quite well (in narrow domains, even if those domains grow every year).

    Chomsky argues that that kind of A.I. is useless because it does not provide knowledge to humanity; it just provides answers (or, as Pablo Picasso said, computers are useless, they only give answers!). Norvig, instead, points to the long list of successes where those systems have helped humanity progress. The discussion goes on, with Chomsky arguing that such A.I. is just an engineering tool, not a real scientific paradigm, because it does not bring knowledge to the world. Norvig replies that real knowledge has been provided, because those systems are able to predict, for example, which word is most likely to come next in a sentence.


    For me, Norvig's vision is like the current status of quantum mechanics: just a list of recipes for what to do in each case, while nobody has an idea of why the recipes work the way they do.
    That is, there is no real understanding of how quantum mechanics works. We just apply the recipe and the result is the one it predicts. But there is no real understanding of why.

    In my opinion, that is what Chomsky was trying to say: even if it works, current A.I. has no understanding at all.

    Understanding is difficult because it cannot be analyzed or observed from the outside. And that is a great problem, because it allows cheaters to use all their weaponry to make us believe there is understanding behind a system that doesn't have it, just because it looks as if it does. Of course, it is easier to cheat by making a system look like something than to make that something real (we all know the situation of cheating at an exam by copying a classmate's answers instead of studying and learning ourselves).
    And this line of cheating has become mainstream in A.I., making us believe that there is no actual difference between cheating and not cheating, i.e. that it does not matter whether the system merely looks like it understands or has real understanding (after all, nobody can define understanding).
    I believe that such a difference is what it is all about. It is the ESSENCE OF INTELLIGENCE. Trying to avoid it through easier paths is just avoiding the real thing.

    So now the most interesting part: how is understanding built into a system?
    I can provide a brief answer, without any real demonstration, based on my own theories. The theory is that understanding is built upon basic knowledge of the world, and scaled up to generate more complex understandings (like maths, philosophy, empathy, etc.).

    Small chunks of understanding are created at a first stage by direct interaction with the world. Interaction with the world provides the basic understanding units for a system. This means the system learns what up and down mean, and not as something like "things above my head are up and things below my chest are down". That is just a definition for a dictionary, and that is what a typical A.I. would do: define a threshold where below it is down and above it is up, and tune the threshold with real experiments.


    Meaning is embedded in the sensorimotor laws of the system, that is, in how the readings of the system's sensors vary with the actions the system takes. Those sensorimotor laws are the basic understanding units. A unit is not a number; it is a law learned by experimenting with the world.

    Then, when an understanding unit is ready and well formed, it can be used to generate higher levels of understanding through a metaphor engine (or an analogy engine; both are almost the same). This means that the basic law (the understanding unit) is used to reason about other, completely different things to which the law still applies. By doing this, new laws are generated, this time no longer sensorimotor based (except in their roots).


    And I am sad that I cannot provide more, because I don't know more.
    Hopefully, some of us are still working on this line of research and, eventually, we will have results that can be shown, for better UNDERSTANDING.


    Related bibliography:
    [1] Metaphors we live by, Lakoff and Johnson
    [2] Where mathematics comes from, Lakoff and Núñez
    [3] Perception of the structure of the physical world using multimodal sensors and effectors, Philipona and O'Regan
    [4] Artificial Intelligence: a modern approach, Russell and Norvig
    [5] Understanding Intelligence, Pfeifer and Scheier
    [6] A sensorimotor account of vision and visual consciousness, Noe and O’Regan
    [7] Why red doesn’t sound like a bell, O’Regan
    [8] Action in perception, Noe

    A 10 Minutes Introduction to Embodied Cognition

    // December 12th, 2012 // No Comments » // Artificial Intelligence

    What is cognition
    Basically, it is a group of mental processes. 
    Cognition requires: 
    1. Perception 
    2. Attention 
    3. Anticipation 
    4. Reasoning 
    5. Learning 
    6. Inner Speech 
    7. Imagination 
    8. Memory 
    9. Emotions 
    10. Planning 
    11. Pain and Pleasure 

     Most cognitive scientists view cognition as something that is computational

    Cognition = Computational System 

    By a computational system they mean a system that manipulates symbols: symbols without meaning are manipulated by applying rules, thereby generating new symbols and conclusions. 

    How does a living system get the symbols? 
    A sensor gathers meaningful data. This data is converted into symbols. Then the brain uses the symbols to generate a (symbolic) response. The response is translated into meaningful action data that is executed by the actuators of the living system. 

    This explanation is perfect for developers of artificially intelligent systems because it implies that the brain doesn't need a body. Scientists can then concentrate on generating intelligence in any physical system that allows the manipulation of symbols, and forget about the hardware. 

    There is, though, a small problem: how do the symbols acquire and release their meaning? The process by which meaningful data is translated into symbols, and symbols are translated back into meaningful data, is called reification, and to date nobody knows how it is performed (this is the grounding problem)... at least in a non-embodied cognitive framework.

    Let us then make a hypothesis: 
    meaning arises from the nature of the body 

    What does this mean? 
    • Living things generate a basic set of meaningful concepts based on their interaction with the world 
    • More complex concepts can be generated by applying metaphors over previous concepts. 

    This approach requires a body to generate intelligence. That is the approach of the embodied cognition paradigm, and it has the following implications:

    • Conceptualization: the properties of an organism's body constrain the concepts it can acquire (this has big implications for making artificial systems understand) 
    • Replacement: interaction with the environment replaces the need for representations (this has big implications for the number of resources needed to create an A.I.) 
    • Constitution: the body is constitutive of cognitive processes rather than merely causal (this has big implications for how perception should be done in artificial systems)

    More Cognition, Less CPU

    // October 29th, 2012 // No Comments » // Artificial Intelligence, Talks

    The following post is a straight transcription of the speech with the same title that I gave at RoboBusiness 2012 in Pittsburgh. You can find other details of the speech in this link.
    You can use the text and images at your will but give credit to the author.


    What is preventing us from having humanoid robots like these at home?

    What is preventing us from having one at home?

    What is preventing us from selling those robots?

    We are still years away from those robots.
    And there are many reasons for that, one of them being that robots are not intelligent enough. Some researchers say that the reason for this lack of intelligence is just a lack of CPU power and good algorithms. They believe that big supercomputers will allow us to run all the complex algorithms required to make a robot safely move through a crowd, recognize us among other people, or understand speech commands. For those researchers, it is just a matter of CPU, which will be solved by the year 2030, when CPU power will reach that of a human brain (according to some predictions).

    But I do not agree.

    This is a gnat.

    A gnat is an insect that doesn't even have a brain, just a few nerve cells distributed along its body. Its CPU power (if we can measure such a thing for an animal) is very small. However, the gnat can live in the wild: flying, finding food, finding mates, avoiding dangerous situations and finding the proper place for its offspring... all of this over four months of life. To date, no robot is able to do what a gnat does with the same computational resources.

    I believe that considering artificial intelligence just a matter of CPU is a brute-force solution to the problem of intelligence, and it discards an important part of it.

    I call the CPU approach the easy path to A.I.

    I believe there is another approach to artificial intelligence, one where CPU has its relevance, but is not the core. One where implementing cognitive skills is the key concept.

    I call this approach the hard approach to A.I. I think it is the approach required to build those robots that we'd love to have.

    In this post I want to show you:
    – What I call the easy path to A.I.
    – Why I believe this path will eventually fail to deliver the A.I. required for service robots
    – What I call the hard path to A.I., and why it is needed

    The easy approach to A.I.

    So, what is what I call the easy path to A.I.?

    Imagine that you have a large database at your service to store all the data you compute about a subject, in the following way: if this happens then do that; if this other thing happens then do that.
    There will be, though, some data that you cannot know for sure, because there are uncertainties in the information you have, or because it requires very complex calculations.
    For those cases you calculate probabilities, in the sense of: if something like this happens, then this is very likely the best option.
    Then you take decisions based on those tables and probabilities. If you apply this method to your daily life, you will decide whether what you are looking at is a book or an apple based on the data in the table or on the probability. If you could compute the table and the probabilities for the whole world around you, you could take the best decision at any moment (actually, the table would be taking the decision for you; you would just be following it!).

    That is exactly the approach followed in computer board games, one of the first areas where artificial intelligence was applied. For example, in tic-tac-toe.

    You all know tic-tac-toe. You may also know that a complete solution exists for the game: a table that indicates the best move for each configuration of the board.

    When the table of best moves for each combination is discovered/computed, the game is said to be solved. Tic-tac-toe has been solved since the very beginning of AI because of its simplicity. Another game that has been solved is checkers; its full table was computed in 2007.
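    What "solving" a game means can be sketched in a few lines of Python: an exhaustive minimax search that computes the game-theoretic value of every tic-tac-toe position, which is exactly the "table of best moves". This is my own sketch, not taken from any particular solver:

```python
from functools import lru_cache

# Board: a 9-character string, '.' for empty, read left-to-right, top-to-bottom.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def solve(board, player):
    """Game value for `player` (the side to move): +1 win, 0 draw, -1 loss."""
    if winner(board) is not None:
        return -1                              # the opponent completed a line on the previous move
    if '.' not in board:
        return 0                               # full board, no winner: a draw
    other = 'O' if player == 'X' else 'X'
    return max(-solve(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == '.')

print(solve('.' * 9, 'X'))  # -> 0: tic-tac-toe is a draw under perfect play
```

    Solving checkers followed the same principle, just with enormously more positions and years of computation.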

    There are however some games that have not been solved yet, like for example go…

    … or chess.

    Those games have many more possible board configurations. They have so many that they cannot be computed with the best supercomputers of today. For instance, the game of chess has as many possible combinations as:

    – 16+16 = 32 chess pieces and 64 squares, so 64!/32! ≈ 4.8·10^53 combinations

    Due to that huge number of possible combinations, it is impossible (to date) to build the complete table for all chess board combinations; there is not enough CPU power to compute such tables. However, complete tables already exist for boards with any combination of 6 or fewer pieces.
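    The number quoted above is easy to reproduce. It is a back-of-the-envelope upper bound that counts ordered placements of 32 distinct pieces on 64 squares, ignoring the actual rules of chess:

```python
import math

# Upper bound on chess positions: place 32 distinct pieces on 64 squares,
# i.e. 64 * 63 * ... * 33 = 64! / 32! placements.
bound = math.factorial(64) // math.factorial(32)
print(f"{bound:.1e}")  # -> 4.8e+53
```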

    So a computer playing chess can use the tables to know the best move when there are only 6 pieces on the board. What does the computer do when it is not in one of those 6-or-fewer-piece situations? It builds another kind of table... a probabilistic one. It calculates probabilities based on some cost functions. In those cases, the machine doesn't know the best move at each moment, but it has a probability for which one is best. The probabilities are built based on human knowledge.

    Using this approach, it has been possible for a computer to beat the best human player in the world.

    However, for me this approach makes no sense if what we are after is a system that knows what it is doing and uses this knowledge to perform better.

    You may tell me: who cares, if in the end they do the job correctly (and even better than us!)? And you may be right, but only for this particular example of chess.

    The table approach is as if you gave a student a table with all the answers for every exam he will have to take. Of course he will do perfectly! Anyone would. But in the end, the student knows nothing and understands less.

    Why more CPU (the easy approach) may not be the solution

    But anyway, the CPU approach is exactly what most A.I. scientists have in mind when they talk about constructing intelligent systems.

    Given the success obtained with board games, they started to apply the same methodology to many other A.I. problems, like speech recognition or object recognition. Hence, the problem of constructing an AI has become a race for resources that allow bigger tables and more calculation, attacking the problem on two fronts: on one side, developing algorithms that can construct those tables more efficiently (statistical methods are winning at present); on the other, constructing more and more powerful computers (or networks of them) to provide more CPU power for those algorithms. This methodology is exactly what I call the easy AI.

    But do not misunderstand me: even if I call it the easy approach, it is not easy at all. This approach is occupying some of the best minds in the world. I call it that because it is a kind of brute-force approach, and because it has a clear and well-defined goal.

    And because it is successful, this approach is being used in many real systems.
    The easy AI solution is the approach used for example in the Siri voice recognition system.

    Or the approach of Google Goggles (an object recognition system).

    Both cases use the power of a network of computers connected together to work out what has been said or to recognize what has been shown.

    And the idea of using networks of computers is so successful that a new trend following those ideas has appeared for robots, called cloud robotics, where robots are connected to the cloud and have the CPU of the whole cloud for their calculations (among other advantages, like information shared between robots, extensive databases, etc.).

    That is exactly how Google's driverless cars can drive: by using the cloud as a computing resource.

    And that is why cloud robotics is being seen as a holy grail. A lot of expectation is put on this CPU capacity, and it looks (again) as if A.I. were just a matter of having enough resources for complex algorithms.

    I don’t think so.

    I don't think that the same approach that works for chess will work for a robot that has to handle real-life situations. The table approach worked for chess because the world of chess is very limited compared with what we have to deal with in speech recognition, face recognition or object recognition.

    Google cars do not drive on the highway during rush hour (to my knowledge).

    Google Goggles makes as many mistakes as correct detections when you use it in a real-life situation.

    And Siri sidesteps the real-life speech recognition problem because it uses a mic close to your mouth.

    Last February I was at the euCognition meeting in Oxford. There I attended a talk given by Professor Roger Moore, an expert in speech recognition systems.

    In his talk, Professor Moore suggested that in recent years, even if some improvement was made thanks to the increase in CPU power, speech recognition seems to have reached a plateau in error rate, between 20% and 50%. That is, even though CPU power has been increasing over the years and the speech recognition algorithms have been made more efficient, no significant improvement has been obtained; worse still, some researchers are starting to think that some speech problems will never be solved.
    After all, those speech algorithms are only following their tables and statistics, and leave all the meaning out of the equation. They do not understand what they are doing, hearing or seeing!
    He ended his talk indicating that a change of paradigm may be required.

    Of course, Professor Moore was talking about including more cognitive abilities in those systems. And of all the cognitive abilities, I suggest that the key one is understanding.

    Understanding as a solution, the hard approach

    What do I mean by understanding? That is a very difficult question that I can't answer directly. What I can do is show what I mean with a couple of examples.

    This is a real case that was presented to Deep Thought, the machine that almost beat Garry Kasparov in the 90's (from William Hartston, J. Seymore & D. Norwood, New Scientist, no. 1889, 1993). In this situation, the computer plays white and has to move. When this situation was presented to Deep Thought, it took the rook. After that, the computer lost the game due to its inferior number of pieces.

    When this situation is presented to an average human player, he clearly recognizes the value of the pawn barrier: it is the only protection he has against black's superior number of pieces. The human avoids breaking it, leading the match to a draw. The person understands its value; the computer does not.

    Of course, you can program the machine to learn to recognize the pattern of the barrier. That doesn't mean that the computer has grasped the meaning of the barrier, only that it has a new set of data in its table for which it has a better answer. The demonstration that the machine doesn't understand anything comes when you present it with the next situation.

    In this situation, again the computer plays white. There is no barrier, but white can generate one by moving the bishop. When the situation is presented to a computer chess program, it takes the rook. (From a Turing test by William Hartston and David Norwood.)

    What this situation shows is that the machine has no understanding at all. It cannot grasp the meaning of a barrier unless you specify in the code all the conditions for it. And even if you manage to encode the conditions, there will be no REAL understanding: variations of the same concept are not grasped by the machine, because it doesn't have any concept at all.

    The only solution that the easy approach proposes for that problem is to have enough computational power to encode all the situations. Of course, those situations have to be detected beforehand by somebody in order to make the information available to the machine.

    The problem I see with that approach, when applied to real-life situations like those faced by a service robot, is that it may not be possible to achieve such comprehension with understanding out of the picture... mainly because you will never have enough resources to deal with reality.

    Another example of what I mean by understanding.
    Imagine that we ask our artificial system to find a natural number that is not the sum of three squared numbers.

    How would easy AI solve the problem? It would start checking all the numbers, starting from zero and going up.

    0 = 0^2 + 0^2 + 0^2
    1 = 0^2 + 0^2 + 1^2
    2 = 0^2 + 1^2 + 1^2
    ...
    7 ≠ 0^2 + 0^2 + 0^2 = 0
    ...
    7 ≠ 1^2 + 1^2 + 2^2 = 6
    7 ≠ 1^2 + 2^2 + 2^2 = 9 ← here it is! Seven is the number!

    In this example, the AI found proof that there is a number that is not the sum of 3 squared numbers. Easy, and with just a few resources used. As you can see, this is a brute-force approach, but it works.
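    The brute-force search just described fits in a few lines of Python (a sketch of the easy-AI procedure, not any particular system):

```python
from itertools import combinations_with_replacement

def sum_of_three_squares(n):
    """True if n = a^2 + b^2 + c^2 for some non-negative integers a <= b <= c."""
    limit = int(n ** 0.5)
    return any(a * a + b * b + c * c == n
               for a, b, c in combinations_with_replacement(range(limit + 1), 3))

# Scan upward from zero until a counterexample appears.
n = 0
while sum_of_three_squares(n):
    n += 1
print(n)  # -> 7, the first natural number that is not a sum of three squares
```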

    Now imagine that we want the same system to find a number that is not the sum of 4 squared numbers.

    The easy AI would follow the same approach, but now it would require more resources. After having checked that the first 2 million numbers are all sums of 4 squared numbers, you might start thinking that many more resources are going to be needed to find one that is not. You can add faster computers and better algorithms to compute the additions and squares, but the A.I. will never find it, because it doesn't exist.

    There is no such a number!

    How do I know that it doesn’t exist?

    Because there is a theorem by Lagrange that demonstrates just that.
    The human approach to solving the problem is different. We try to understand the problem and, based on this understanding, find a proof instead of trying every single natural number. That is what Lagrange did. And he did not require all the resources of the Universe!
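    The contrast can be made concrete: a brute-force check happily confirms Lagrange's four-square theorem for as many numbers as you have CPU for, yet no amount of checking amounts to a proof. A small sketch (splitting the four squares into two pairs of two squares is just a speed-up):

```python
# Empirically verify that every natural number up to N is a sum of four squares.
# Since n = (a^2 + b^2) + (c^2 + d^2), it suffices to know which numbers
# are sums of two squares.
N = 2000
is_two_squares = [False] * (N + 1)
for a in range(int(N ** 0.5) + 1):
    for b in range(a, N + 1):
        s = a * a + b * b
        if s > N:
            break
        is_two_squares[s] = True

checked = all(any(is_two_squares[k] and is_two_squares[n - k] for k in range(n + 1))
              for n in range(N + 1))
print(checked)  # -> True for every n up to 2000, and still not a proof
```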

    And that is my definition of understanding and I cannot put it into better words.

    Now the next question is: if I say that understanding is what is missing, how can we include it in our robots? And how can we measure whether a system has understanding?

    Given that there is no clear definition of what exactly understanding is, we know even less about how to embed it into a machine. That is why I call this the hard approach to A.I. Hence, I can only provide you with my own ideas about it.

    I would say that a system has understanding of a subject when it is able to make predictions about the subject: it is able to predict how the subject would change if some input parameters change.
    I think that you understand something when you are able to make predictions about that something, plus you are aware that you can predict it. I cannot tell you more.

    How can we test whether a machine has understanding? This is also not clear. At present, the standard test to evaluate an artificial machine is the Turing test, which you all know. That is, if the machine can do the job at least as well as a person, then it is OK for us.

    However, this test can be fooled in the same way the teacher was fooled by the student who was given the answers by someone else.

    And the reason is that the Turing test focuses on one single part of intelligence: the what part. From my point of view, intelligence has to be divided into two parts: the what and the how.

    intelligence = what + how

    The what indicates what the system is able to do, for example play chess, speak a language or grasp a ball. The how indicates in which way the system performs such a thing, and how many resources it uses: either looking up a table, calculating probabilities, or actually reasoning about meanings.

    Examples of a system with a lot of what but little how: the chess player, or the student at the exam. An example of a lot of how but little what: the gnat's life.

    The problem with current artificial intelligence is that it concentrates only on the what part. Why? Because it is easier and provides quicker results, but also because it is the part that can be measured with a Turing test.

    But the how part is as important as the what. However, we have no clue about how to measure it in a system. One idea would be to use experiments similar to the ones used by psychologists. However, this would only allow us to measure systems up to a human level, not beyond it or different from it (because we cannot even mentally conceive of those).

    To conclude,
    I think that at some point in our quest for artificial intelligence we got confused about what intelligence is. Of course natural intelligence uses tables, and also calculates probabilities in order to be efficient. But it also uses understanding, something that we cannot define very well, and that we cannot measure.

    Time and resources need to be dedicated to studying the problem of understanding, not just passing it by as has happened until now.

    Intelligence Is Also About How We Do Things

    // August 24th, 2012 // No Comments » // Artificial Intelligence

    Since the creation of the artificial intelligence field, an AI has been judged by what it is able to do. Programs that can follow a conversation, that can predict where there is oil underground, that can drive a car autonomously... all of them are judged intelligent based only on their functional behavior.

    To achieve the desired functionality, all kinds of tricks have been (and are) used and accepted by the AI community (and in most cases they were not even qualified as tricks): using large data structures that cover most of the search space, reducing the set of words the AI has to recognize, or even asking a human for the answer over the internet.

    The Turing Test: C cannot see neither A nor B. A and B both claim that they are humans. Can C discover that A is an AI?

    The intelligence of those systems could be measured using the Turing test (adapted to each particular AI application): if a human cannot distinguish the machine from a person performing the same job, then the machine would be considered a successful AI. What this means is: if it does the job, then it is intelligent. And this kind of measure has led to the kind of AI that we have today, the kind that can beat the best human player at chess, but that is not able to recognize a mug after small changes in the lighting of the room.

    But not everybody agrees with that definition of AI. For instance, the Chinese room argument put forward by Searle holds that such an AI shows no real intelligence, and never will as long as it is based on that paradigm.

    In the Chinese room experiment, a man inside a room answers questions in Chinese by following the instructions provided in a book, without understanding a word of Chinese

    I do agree with Searle’s argument. I think the problem here is that we are missing one important component of intelligence. From my point of view, an intelligent behavior can be divided into two components: what is done by the behavior, and how the behavior is done.

    intelligence = what + how

    The what: the intelligent behavior that one is able to do.
    The how: the way this intelligent behavior is achieved.

    The Turing test only measures one part of the equation: the what. 99.99% of AI programs today put all their weight on the what part of the equation, on the assumption that, if enough weight is given to that part, nobody will notice the difference (in terms of intelligence).

    The reason for putting the weight on the what part is that it is easier to implement, and furthermore we can measure it (that is, we can use the Turing test to measure the what). Since there is no way of measuring the how, and nobody has a clue about how humans actually do intelligent things, people just prefer to concentrate on the part that provides results in the short term: the what part. After all, the equation can reach a large value through either of its two constituents…

    However, I think that real intelligent behavior requires weight in both constituents of that equation. Otherwise we obtain unbalanced creatures that are far away from what we humans are able to do in terms of intelligent behavior.

    Here are two examples of unbalanced intelligences:

  • The case of an intelligent behavior that scores only on the what: this is the case of a guy who has to take an exam on quantum mechanics. He has not studied at all, so he doesn’t know anything about the subject. He has, though, a cheat sheet (provided by the teacher’s secretary) that allows him to answer all the exam questions correctly. After evaluating the exam, the teacher would say that he has mastered the subject. His knowledge of the subject is being judged only by what he has done (answering the exam). We would say that he is very intelligent in the field of quantum mechanics, but by observing how he answered the questions, we can see that he has no knowledge at all. He looks intelligent, but he is not.
  • The case of an intelligent behavior with most of its value on the how: this is the case of animals, any of them, ranging from the smallest to those closest to us in terms of intelligence. Animals do have a lot of how intelligence related to the tasks they are able to do, and they cannot do as many things as we can because their score on the what is lower.

    Animals have low intelligence in the ‘what’ part but quite a lot in the ‘how’

    Now, the question is: how can we measure the how part?
    That is a difficult matter. Actually, we do not have any kind of reliable measure for it, even for humans. I would propose that the how can be measured by measuring understanding. We decide how to do something based on our understanding of that thing and of everything related to it. When we understand something, we are able to use/perform/communicate it in different situations and contexts. It is our understanding that drives how we do things.

    In this sense, psychologists have created experiments with infants that try to figure out what they understand and to what extent they understand it [1][2]. Based on that, a scale built on the different stages of infant development could be created and applied to AIs to measure their understanding. I would take human development as the metric for this scale, starting at zero, equivalent to the understanding of a newborn child, and ranging up to 10 for the understanding of an adult. The same tests could then be applied to a machine in order to know its level of intelligence in the how part.

    As a conclusion, I believe that intelligence is not about looking at a table and reading out the correct answer (as in the case of chess, Go, or the guy who cheats on the exam). Intelligence involves finding solutions with a limited amount of resources across a very wide range of situations. This stresses the importance of how things are done in performing an intelligent behavior.

    [1] Jean Piaget, The origins of intelligence in children, International University Press, 1957
    [2] George Lakoff and Rafael Núñez, Where Mathematics Comes From, Basic Books, 2000

    To calibrate or not to calibrate…

    // March 24th, 2012 // 1 Comment » // Artificial Intelligence, Research

    Robot calibration. I would define it as the process by which a robot determines the actual position, within its body, of a given part that is important to it (usually a sensor), relative to a given frame of reference (usually the body center). For example: where exactly on the robot’s body the stereo camera is located, relative to the robot’s center.
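    To make this concrete, here is a minimal sketch (hypothetical numbers, plain NumPy rather than any ROS API) of what “where exactly is the camera” means in practice: the camera pose is a rigid transform relative to the base frame, and a few millimetres or hundredths of a radian of error in that transform shift every observation the camera makes.

```python
import numpy as np

def pose_to_transform(xyz, rpy):
    """Build a 4x4 homogeneous transform from a translation and roll-pitch-yaw angles."""
    roll, pitch, yaw = rpy
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    R = np.array([
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp,     cp * sr,                cp * cr],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = xyz
    return T

# Where the engineers *designed* the camera to be vs. where it *actually* ended up:
T_designed = pose_to_transform([0.10, 0.0, 1.20], [0.0, 0.0, 0.0])
T_actual = pose_to_transform([0.11, 0.004, 1.19], [0.0, 0.02, -0.01])

# A point observed 2 m straight ahead of the camera, mapped into the base frame:
p_cam = np.array([2.0, 0.0, 0.0, 1.0])
print(T_designed @ p_cam)  # where the robot *thinks* the point is
print(T_actual @ p_cam)    # where the point really is
```

    Even the small 0.02 rad pitch error above displaces a point 2 m away by several centimetres, which is why the designed values alone are not enough.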

    In a perfect world, calibration would never be necessary. The mechanical engineers would design the position of each part and piece of the robot, and hence everybody would have access to that information just by asking the engineers where they put each part.
    However, real life is more interesting than that. The designed positions of robot parts NEVER correspond to the actual positions in the real robot. This is due to errors made during construction, tolerances between parts, or even errors in the plan.

    To handle this uncertainty, the process of calibration was invented. So, EACH ROBOT has to be calibrated after construction, before it can be used.

    The PR2 robot uses a checkerboard pattern to calibrate its camera

    All types of robots have to be calibrated, but in the case of Service Robots the situation is more complex, because the number of sensors and parts involved is larger (a different calibration system must be designed for each part).
    Furthermore, given that the current AI systems controlling the robot rely on very precise calibration to work correctly (they do not handle noise and error very well), having a good calibration system is crucial for a Service Robot.

    Current approaches to calibration follow more or less the same recipe: the robot is placed in a controlled, specific environment and performs some measurements with the sensors that need to be calibrated. This is the process followed in hand-eye calibration [1], odometry calibration [2], and laser calibration [3].
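    All those references share the same structure: put the robot in a known setup, collect measurements, and solve a least-squares problem for the unknown sensor pose. A toy, translation-only version of that idea (hypothetical numbers, no real calibration package) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the robot repeatedly observes a landmark whose position in
# the base frame is known exactly (the "controlled specific environment").
# The unknown to estimate is the camera's translation offset from the base.
true_offset = np.array([0.11, 0.004, 1.19])   # ground truth, unknown to the robot
landmark = np.array([2.0, 0.5, 0.8])          # known landmark position, base frame

# Measurement model (translation only, for simplicity; real calibration also
# estimates rotation): camera reading = landmark - offset + noise
readings = np.array([landmark - true_offset + rng.normal(0.0, 0.002, 3)
                     for _ in range(50)])

# Least-squares estimate: for this linear model it reduces to an average.
estimated_offset = (landmark - readings).mean(axis=0)
print(estimated_offset)  # close to the true offset
```

    The catch, of course, is the first line of the setup: the landmark position must be known exactly, which is precisely the “very specific setup” requirement discussed below.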

    All those processes require the robot to perform specific actions with a very specific setup.
    The problem arises when you have many systems to calibrate (for example, in a Service Robot) and the robot also has to be recalibrated from time to time due to changes in its structure (robots change just through use!).

    The Reem robot performs specific movements to calibrate its arms

    So, a more general approach to calibration has to be designed, one that avoids defining a specific calibrator for each part.
    This also has to be a lifelong calibration system that allows the robot to calibrate itself without specific setups (usually available only at specific locations). Summarizing: the robot must learn its sensorimotor space and adapt it as it changes throughout its whole life.

    Theory towards this end has already been put in place in the work of Philipona and O’Regan [4][5].
    In their work, Philipona and O’Regan propose an algorithm that would allow any robot to learn any sensorimotor system, the relations between sensors and motors, and how they relate to the physical world… without any previous knowledge of its body or of the space around it!

    Applying this theory to the calibration of a robot would allow any robot to calibrate itself regardless of where it is (not necessarily at the factory; maybe at the owner’s home) and of its sensorimotor configuration, and also to adapt to changes throughout its whole life, without returning to the factory or requiring any specific action from the owner.

    At present, such a calibration system is almost science fiction. I am not aware of anybody using it, but who knows if somebody is already working on it somewhere in the world… maybe at PAL Robotics?…

    If you are interested just contact me.

    [1] Optimal hand-eye calibration, Klaus H. Strobl and Gerd Hirzinger, ICRA 2006
    [2] Fast and Easy Systematic and Stochastic Odometry Calibration, A. Kelly, IROS 2004
    [3] Laser rangefinder calibration for a walking robot, E. Krotkov, ICRA 1991
    [4] Is there something out there? Inferring space from sensorimotor dependencies, D. Philipona, J.K. O’Regan, J.-P. Nadal, 2003
    [5] Perception of the structure of the physical world using unknown multimodal sensors and effectors, D. Philipona, J.K. O’Regan, NIPS 2003

    Understanding Understanding (For Service Robots)

    // February 18th, 2012 // 1 Comment » // Artificial Intelligence

    The robot looks at the can. Then it computes a long list of conditionals: if the pattern over there is like this, then it may be a can of Coke with an 80% likelihood. If instead the pattern looks like that, then it may be a book with a 75% likelihood. If the pattern is like that other one, maybe it is a car. If none of those patterns is recognized, then move a little to the left and look again.

    It may be that we humans work this way when trying to recognize an object: we explore an endless list of conditions (a set for the recognition of each concept we know) that, at the end of the chain, determine what we see, hear or feel, with a given probability.

    I don’t think so.

    It would take us ages to recognize everything we are able to recognize if that were the procedure we followed!

    Something is missing in that picture, and its absence prevents us from creating robots able to recognize voices, objects and situations. That is the key element we need to comprehend in order to create artificial cognitive systems.

    And I believe the important element is understanding.

    Understanding is what allows us to identify what is relevant to the situation at hand. It is what allows us to focus on what really matters. It is what makes us recognize the can, the book or the speech of our neighbour in a reasonable amount of time. We do not understand because we recognize; we recognize what we understand.

    Understanding compresses reality into chunks that we can manage. Instead of having to construct a table with all the cases, we understand the situation, and by doing so we compress that reality into a lower-dimensional space that can be managed in a reasonable amount of time.

    Without understanding, a machine is not able to discard any of the options in front of it, and hence it has to evaluate all the possibilities. That is why, for a robotic system to recognize something, it has to build a table with all the options and check every one of them to find which matches the current situation best. Of course, you can provide the machine with some tricks (most of them heuristic rules) that help it discard obviously useless options and reduce the table significantly. But in the end, those are nothing more than that… tricks.
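    As a caricature of what such table-based recognition amounts to, here is a toy sketch (all names and feature vectors are hypothetical): every stored template has to be compared against the observation, and adding a new concept means adding another row to the table.

```python
# A caricature of table-based recognition: compare the input against EVERY
# stored template and return the best-scoring label.
def similarity(a, b):
    """Toy similarity: fraction of matching features."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))

def recognize(observation, templates):
    best_label, best_score = None, 0.0
    for label, template in templates.items():      # one comparison per table row
        score = similarity(observation, template)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

templates = {
    "coke can": (1, 0, 1, 1),
    "book":     (0, 1, 1, 0),
    "car":      (1, 1, 0, 0),
}
print(recognize((1, 0, 1, 0), templates))  # ('coke can', 0.75)
```

    Pruning heuristics can skip some rows, but the structure stays the same: a table lookup, with no notion of why the winning entry matches.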

    There is a paper claiming that the game of checkers has been solved [1]. By definition, a game is solved when the complete table of movements for a perfect game has been discovered, which allows the machine to know exactly, at each step, which is the best movement in order to win. But knowing the table and always winning does not mean that the system understands any of the moves, nor the game itself.

    A good example of what I mean is the chess position described by Roger Penrose in his book Shadows of the Mind [2], depicted in the following figure.

    Deep Blue doesn't understand chess

    As explained in this Fountain Magazine post:

    In the chess game above, by just playing the king left and right, white can bring the game to a draw. But first, to make the game a draw, the white player has to understand the situation. Since the computer has no capability to understand, it may think that it would be more profitable to take the castle, and therefore it loses the game

    For me, understanding is the main challenge that artificial cognitive systems must face in the near future if we want them working around us (as, for example, Service Robots).

    Of course, for a system that has the whole table of all the possibilities of the Universe, understanding is not necessary. Understanding is only necessary when the available resources are small compared with the complexity of the problem to solve. The problem of understanding understanding, and of reproducing it in artificial systems, is then the key to successful AI systems.

    But this is a much harder problem than working on tables and on methods to cut down the explosion of possibilities. That is why AI researchers prefer to concentrate on using bigger computers and finding better selective searches. By doing this you can obtain quicker partial results that you can publish, sell, you name it. Even if those are very difficult things, they are easier than implementing real understanding.

    Current intelligent systems are just cheaters. We provide them with a piece of paper containing the correct answers. At most, we provide them with complex algorithms to decode the correct answers, answers that we have prepared and know in advance. But when the answer is not in the table, the system gets lost, and its lack of understanding becomes apparent.

    Now, what does it mean for a robot to understand? For me, the key element of understanding is prediction.

    You cannot make a robot understand what a wheel is by showing it a set of different wheels and making it generalize. If we show the robot a round steel plate, it will not identify it as a possible wheel, because no example of that type was included in the generalization. Instead, we have to provide the robot with prediction skills. These will allow it to predict how such an item would behave in the application at hand, and hence it will classify the plate as a wheel, because it can predict that the plate could roll like one.

    We predict all the time. We understand something when we can predict it. We understand a scene because we can predict it: we pick the interpretation of the scene that makes sense given our limited senses, and then we are able to predict using that interpretation. If our prediction fails, then we do not understand what is happening, and we feel lost.
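    To contrast with the table-based recognizer caricature, here is an equally toy sketch of classification-by-prediction (the predicate and its features are entirely hypothetical): instead of matching the plate against stored wheel examples, the system asks whether it predicts the object would roll.

```python
# Classification by prediction rather than by example matching.
# The predicate and its feature names are purely illustrative.
def could_work_as_wheel(obj):
    """Predict: a rigid, round object of sensible size would roll if mounted."""
    return obj["shape"] == "round" and obj["rigid"] and obj["diameter_m"] < 2.0

steel_plate = {"shape": "round", "rigid": True, "diameter_m": 0.6}
book = {"shape": "rectangular", "rigid": True, "diameter_m": 0.3}

# No wheel dataset contains a steel plate, yet the prediction admits it:
print(could_work_as_wheel(steel_plate))  # True
print(could_work_as_wheel(book))         # False
```

    The hard part, of course, is not writing such a predicate by hand but having the system acquire its predictive model from interaction, which is the open question below.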

    Having reached this point, the real question is how we embed this prediction ability into an artificial system and make it use it to create understanding. Now that’s another story…


    [1] Jonathan Schaeffer, Neil Burch, Yngvi Björnsson, Akihiro Kishimoto, Martin Müller, Robert Lake, Paul Lu, Steve Sutphen, Checkers is solved, Science, 2007

    [2] Roger Penrose, Shadows of the mind, 1994

    Challenges for Artificial Cognitive Systems

    // January 29th, 2012 // No Comments » // Artificial Intelligence, Research

    Last weekend (20 to 22 of January, 2012), an extra workshop organized by the euCognition research network was held at Oxford: the second edition of Challenges for Artificial Cognitive Systems.

    I attended this workshop hoping to figure out answers to the following questions:

    1. Is it necessary to include cognition in artificial systems in order to create one (Service Robot) that can be useful in dynamic human environments?
    2. Which kind of cognitive abilities do we have to include for a particular type of robot?
    3. What are cognitive abilities anyway?

    The workshop was planned as a set of discussion sessions, in an environment that encouraged discussion and debate, and the people attending really wanted to find answers. The result was a very dynamic and interesting debate about all those questions and other related issues.

    First, we discussed the kind of cognitive skills an artificial system would need. This led to the following basic list:

    1. The ability to interact with humans in a natural way (whatever that means)
    2. The ability to adapt to and learn from the environment
    3. The ability to achieve a goal autonomously

    This list answered, in a broad sense, the third question in my mind (what are cognitive abilities?). However, I did not feel completely satisfied with those definitions and still needed something more concrete…

    We then discussed how to measure progress toward those goals without allowing cheating. That is, how can we create a scale that measures how much of each ability has been achieved, and to what degree, while verifying that the system showing those abilities is not using something non-cognitive (like, for example, a table with all the possible states of the system and the answer to provide in each one)?

    Related to this last point, we saw the need to identify which kinds of tasks require cognition in order to be solved, and which ones don’t. This point was related to my first two questions, and can be restated as: which classes of problems require cognitive abilities in order to be solved? Of course, no satisfactory answer was provided, and the issue remains open for future meetings (and years).

    Apart from what I have mentioned above, I found the following a very important observation made at the meeting:
    current artificial intelligent systems have reached a plateau of improvement. It looks like one way to move off that plateau is to incorporate more cognitive abilities into our artificial intelligent systems.

    I did not feel quite happy with either the definition of a cognitive system or the list of abilities required for such a system. The reason is that, for me, understanding is the single basic cognitive ability upon which everything else a robot does must be built. By means of understanding, a system is able to acquire meaning through its interaction with the environment, and to use this meaning to survive, adapt, learn and generate its own goals (you can read what I mean by understanding in this blog post). I think that understanding, and only understanding, is the special characteristic that defines a system as cognitive. Unfortunately, current artificial systems have almost no understanding.

    If you are interested, the official results are published at the wiki of the euCognition project.

    Compliance: trending topic at the Humanoids 2011

    // November 7th, 2011 // No Comments » // Artificial Intelligence, Research, Work

    Compliant robot: a robot with the ability to tolerate and compensate for misaligned parts. Or, stated otherwise, a robot with the ability to gracefully absorb an external force that tries to modify its position.

    At the last Humanoids conference, everybody was talking about how to control a compliant arm, how to build compliant legs and how to move a compliant humanoid.
    We introduced our latest Reem robot to the scientific community and, besides the typical question about how much the robot costs, the number one question was: is your robot compliant? Some people even crashed their bodies against the robot to check whether it had compliant arms!
    No, our robot is not compliant… yet.

    Of course, compliance is a very important feature for a service robot, because we must be sure that a robot working with humans will not harm a person. Hence, if someone crashes against the robot (or vice versa), we, as builders of the robot, must ensure that nobody gets hurt.

    Some other robots in the world have already shown very nice compliant characteristics. This is the case of the Meka robots. You can watch a nice video here, where the robot shows its compliance.

    Another case is the omnipresent PR2, which in this video shows how compliance can be useful for cooperation.

    However, at present, compliance has a dark side. Because a compliant robot must be able to absorb forces, a compliant joint cannot distinguish between a collision and the carrying of a heavy load. A compliant joint reacts in the same way to both situations: it lets the joint give way in the direction of the applied force. If the robot were carrying a weight, the weight would fall.
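    The ambiguity shows up even in the simplest compliance model. Below is a minimal spring-like joint sketch (hypothetical stiffness and torque values, not any real controller): a collision and a payload produce the same external torque, so the joint yields identically in both cases.

```python
# A spring-like compliant joint at steady state: the stiffness torque balances
# the external torque,
#   k * (q_des - q) + tau_ext = 0   =>   q = q_des + tau_ext / k
def settle(k, tau_ext, q_des=0.0):
    """Steady-state angle of a compliant joint under an external torque."""
    return q_des + tau_ext / k

k = 20.0              # joint stiffness [Nm/rad], hypothetical
tau_collision = -4.0  # torque from a person bumping into the arm [Nm]
tau_payload = -4.0    # gravity torque from a carried load [Nm]

# The joint cannot tell the two apart: it yields identically in both cases.
print(settle(k, tau_collision))  # gives way: good, nobody gets hurt
print(settle(k, tau_payload))    # gives way too: the load drops
```

    Raising the stiffness k keeps the load in place but defeats the safety purpose of compliance; the distinction has to come from somewhere other than the joint itself.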

    This reminds me of the Chi Sao training in Wing Chun kung fu. In this training, the two opponents try to feel the force the other is applying against their arms, and use it to generate a better attack. The basis of this training is learning to differentiate when you have to push and when you have to yield.

    We went through the same kind of training when we were babies, in order to understand the difference between carrying and being pushed.

    The compliant robot is still far from encoding that knowledge. The problem is more delicate than just using a flag that indicates when the robot is in carrying mode and when it is in free mode to absorb collisions (that would be the GOFAI solution). It is necessary to embed into robots a more complex ability that lets them know which of the two situations they are in.

    And that ability is understanding. The robot needs to understand whether a force on its body is due to a crash or to an object being carried.

    Understanding is the most important feature for a robot, not only for compliance but for everything. At present, no robot in the world understands a… emm… anything…

    Tough work in front of us!