(-: (-:
SMILES
:-):-)
A Stunning Multi-purpose Integrated LEarning System
Copyright 2001-2003 The MIP Group: Ferri-Ramírez, Cèsar;
Hernández-Orallo, José; Ramírez-Quintana, M.José
Presentation:
SMILES is a machine learning system that integrates features from many
different machine learning techniques and paradigms and, more importantly,
introduces innovations in almost all of them. In particular, SMILES
extends classical decision tree learners in several ways (new splitting
criteria, non-greedy search, new partitions, extraction of several
different solutions), handles resources in an anytime fashion, and offers
sophisticated and quite effective handling of misclassification and test
costs. In this way, SMILES combines and improves on recent work in
hypothesis combination (e.g. boosting) and cost-sensitive learning (a
priori and a posteriori class assignments, ROC analysis), outperforming
previous systems in many situations. Its main applications are data
mining and any other machine learning task where decision trees could be
useful. It works as a classifier or as a probability estimator, and is
especially designed for ranking (ROC analysis and AUC measures). SMILES
can also extract comprehensible models from ensembles of classifiers or
from neural networks (through mimicking).
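To make the ranking measure mentioned above concrete, here is a minimal sketch of how AUC can be computed from a probability estimator's scores in a two-class problem. It uses the standard pairwise (Mann-Whitney) formulation and is not taken from the SMILES sources; the function name `auc` and the toy scores are illustrative assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve for a two-class problem.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    `labels` uses 1 for the positive class and 0 for the negative class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores, e.g. estimated probabilities of the positive class.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1, 1, 0, 1, 0]
example_auc = auc(scores, labels)
print(example_auc)
```

An AUC of 1.0 means every positive example is ranked above every negative one, while 0.5 corresponds to a random ranking. The quadratic pairwise loop is chosen for clarity; a sort-based version would be preferable for large datasets.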
The System:
You can download the whole system package for academic use under the
following conditions:
DISCLAIMER & COPYRIGHT: The software has been checked on several
Intel-based machines (PCs) under different versions of Linux and MS
Windows. It 'should' compile on any other system with an ISO C++
compiler, possibly with slight modifications. You may modify the
software, provided you always make the changes explicit and credit the
original authors. We are not responsible for any damage caused by the
use or misuse of this software. If you find a bug, please contact the
authors.
For commercial use *do* contact the authors.
Source and Executable Code:
Versions for different platforms:
- UNIX stable version (Linux): SMILES v.2.3.7 UNIX. Once downloaded and
decompressed, follow the readme.txt file and run the installation shell
script for Unix-like machines (or just run make). The package includes
the C++ sources. The latest version (under development) is SMILES
v.2.6.7 UNIX.
- Windows stable version (console application for any Win32 operating
system): SMILES v.2.3.6 Windows. The executable was compiled with
Borland C++ Builder 5.0. Just decompress the zip file and follow the
readme.txt file. (Thanks to Ricardo Blanco for this Windows version.)
The latest version (under development) is SMILES v.2.6.7 Windows.
The system has been successfully compiled with the GNU g++ compiler on
Linux (KDD environment), Borland C++ Builder 5.0 and Microsoft Visual
C++ 5.0, so it should be easy to port to any platform not listed above
(in that case, download the UNIX version, which includes the source
files).
MANUAL:
Completely indispensable if you want to take full advantage of SMILES:
manual in PDF format.
SAMPLE DATASETS:
Many example datasets in SMILES format can be found here (more than
5 MB). If you have no examples in SMILES format, please download these,
because they will be required.
What should it look like?
If the installation is successful, you can
directly type:
smiles -?
and you will get the following usage information:
**** SMILES v.2.6.7 (Release Date: 6-September-2003) ****
USAGE:
smiles file.train [file.test] [file.cost] [file.testcost] <file.oracle>
Enjoy SMILES. Let yourself be surprised by SMILES's learning abilities.
Current Features:
- Multitree construction. An AND/OR tree structure is constructed,
rather than a forest.
- Several solutions can be selected from the multitree.
- Solutions can be combined in many different ways.
- The extraction of an archetype of the multitree is possible.
- Classical features, such as several splitting criteria, pruning, use
of expected error and smoothing options (Laplace, m-estimate).
- Full validation and evaluation statistics (e.g. cross-validation).
- Cost-sensitive and ROC analysis features: ROC-based and AUC-based
splitting criteria and evaluation.
- Test cost facilities.
- Mimetic methods.
- Probability estimation.
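As an illustration of the smoothing options named in the list above, the following sketch shows the textbook Laplace and m-estimate corrections applied to the class counts at a decision tree leaf. It is not taken from the SMILES sources; the function names, parameters and example counts are assumptions made for this example.

```python
def laplace(count, total, num_classes):
    """Laplace-corrected probability: (n_c + 1) / (n + k)."""
    return (count + 1) / (total + num_classes)


def m_estimate(count, total, prior, m):
    """m-estimate probability: (n_c + m * p_c) / (n + m).

    `prior` is the prior probability p_c of the class and `m` controls
    how strongly the estimate is pulled towards that prior.
    """
    return (count + m * prior) / (total + m)


# Hypothetical leaf: 3 training examples, all of the positive class,
# in a two-class problem.
count, total = 3, 3

raw = count / total                               # unsmoothed relative frequency
lap = laplace(count, total, num_classes=2)        # pulled towards 1/2
mest = m_estimate(count, total, prior=0.3, m=10)  # pulled towards the prior 0.3

print(raw, lap, mest)
```

The Laplace correction is the special case of the m-estimate with m equal to the number of classes and a uniform prior. Both avoid the extreme estimates 0 and 1 at small leaves, which also reduces ties when leaves are ranked by their estimated probabilities.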
Future Features:
- Input and output in XML format.
- Incremental extension.
- Functional-logic partitions and higher-order extension.
© 2002-2003 José Hernández Orallo.