A classification procedure (classifier) sorts a collection of objects into abstract sets called classes. Classification schemes are important tools for communication and data reduction in many professional domains; see, for instance, the web site of the International Federation of Classification Societies.

Classification is so deeply embedded in human thinking that it took a philosopher of Plato’s stature to draw the fine line between reality and “ideas”. The definition of a “class” is elusive: a class is mostly defined not by what it is, but by what it is not, that is, by its differences from other classes.

A taxonomy is a (hierarchical) set of classes. A class can be defined by a set of rules, by a full-text Boolean search expression, or through its (statistically evaluated) content. It is common to distinguish between clustering and supervised classification schemes.
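As a minimal sketch of the second option, a class can be written as a Boolean expression over the words of a text. The class names, the keywords and the function names below are hypothetical and chosen only for illustration; a real full-text search engine would add stemming, phrase matching and similar refinements.

```python
def tokens(text):
    """Lower-case the text and return its set of whitespace-separated words."""
    return set(text.lower().split())

# Hypothetical class definitions: each class is a Boolean predicate
# evaluated on the set of words found in a document.
CLASS_RULES = {
    "finance": lambda w: ("bank" in w or "loan" in w) and "river" not in w,
    "geography": lambda w: "river" in w or "mountain" in w,
}

def classify_by_rules(text):
    """Return the sorted names of all classes whose rule matches the text."""
    words = tokens(text)
    return sorted(name for name, rule in CLASS_RULES.items() if rule(words))
```

Here "The bank approved the loan" falls into the finance class only, while "the river bank" is kept out of it by the NOT clause. A text that matches no rule belongs to no class, which illustrates how a rule-defined class is delimited largely by what it excludes.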

Clustering is a procedure that groups similar objects together without outside help, that is, without being told which classes exist. Classification per se (supervised classification) assumes the existence of a standardized taxonomy and of a supervising body that can decide accurately to which class(es) a given object belongs.
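The difference can be illustrated on one-dimensional data. The sketch below is only an illustration under simple assumptions: the clustering part is a plain k-means iteration and the supervised part is a nearest-centroid rule, both picked for brevity rather than taken from any particular product. All function names are hypothetical.

```python
def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

def cluster_1d(values, centers, iterations=10):
    """Clustering: k-means on unlabeled 1-D values, starting from given centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def train_supervised(labeled):
    """Supervised: one centroid per class, computed from (value, label) pairs."""
    sums = {}
    for value, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def classify(value, centroids):
    """Assign value to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))
```

The clustering function receives only values and finds the groups by itself, so the groups it returns carry no names. The supervised functions need labels supplied by a supervising body, and in return they can name the class of a new object.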

Learning algorithms are mathematical procedures that adapt the model parameters of a classification system to fit examples whose class is known, large sets of examples whose class is not known, or both. Such algorithms are in many ways inspired by attempts to understand human learning processes. R-EF employees have a long and successful history of developing new, effective learning algorithms.

Learning (whether by a child or by a machine) is a wonderfully creative process. In a simplified geometric view, learning amounts to computing boundaries between regions of a high-dimensional feature space that describes the important properties of objects. Learning is then the process of creating and continuously refining those boundaries.
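This geometric view can be made concrete with the classical perceptron rule, used here purely as a textbook illustration in two dimensions and not as a description of any R-EF algorithm. The boundary is the line w1*x1 + w2*x2 + b = 0, and every misclassified example pushes the line a little. Function and parameter names are hypothetical.

```python
def train_perceptron(examples, epochs=20, rate=1.0):
    """Learn a linear boundary from ((x1, x2), label) pairs, label being +1 or -1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            score = w[0] * x1 + w[1] * x2 + b
            # An example on the wrong side (or exactly on the boundary)
            # moves the boundary toward classifying it correctly.
            if label * score <= 0:
                w[0] += rate * label * x1
                w[1] += rate * label * x2
                b += rate * label
    return w, b

def predict(w, b, point):
    """Return +1 or -1 depending on the side of the boundary the point lies on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1
```

On linearly separable data the updates stop once every example lies on its correct side. Data that cannot be separated by a straight line would need a richer feature space or a different model.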

R-EF has developed an extremely flexible and powerful classification system: the "Full Metal" is a Bayesian classifier that allows almost instant learning, precise classification of unknown objects, and adaptation to slow changes in the class content or in the environment influencing the objects’ features. This allows performance to keep improving during use, without external help. While the R-EF classification engine itself is universal, its use in specific applications, such as image, natural-language text, and genetic sequence data, is based on powerful models and (automatic) feature extraction methods.
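The internals of the "Full Metal" classifier are not described here, so the following is not its implementation. It is a generic sketch of how a Bayesian classifier can learn incrementally, assuming a multinomial naive Bayes model with add-one (Laplace) smoothing over word features. The class name and method names are hypothetical.

```python
import math
from collections import Counter, defaultdict

class IncrementalNaiveBayes:
    """Naive Bayes over word lists, updated one labeled example at a time."""

    def __init__(self):
        self.class_counts = Counter()             # examples seen per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocabulary = set()                   # all words seen so far

    def learn(self, words, label):
        """Learning is only a count update, so it is effectively instant."""
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def classify(self, words):
        """Return the class with the highest posterior log-probability, or None."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(count / total)       # log prior
            for w in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Because learning only updates counts, new labeled examples can be fed in during use and the decisions shift with them. Following slow changes in class content more closely would additionally require down-weighting old examples, which this sketch does not do.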