DBDT
Decision Tree Learning
2004-2010 The DMIP Group:
Estruch-Gregori, Vicent; Ferri-Ramírez, Cèsar; Hernandez-Orallo, José; Martínez-Plumed, Fernando
DBDT is a machine learning algorithm that integrates decision tree learning and center splitting.
Roughly speaking, the inferred classifier can be viewed as a tree of attribute
prototypes (the value distribution of an attribute is represented by a set of
prototypes). An instance is linked to one prototype or another depending on its
ProbDBDT is a variation of DBDT that uses probability-based distances.
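As a rough illustration of the prototype idea, the sketch below routes a value to the closest prototype of an attribute. It assumes the link is made by proximity, as the distance-based criterion in the feature list suggests. The class name `CenterSplitSketch`, the method `nearestPrototype` and the absolute-difference distance are hypothetical and are not taken from the DBDT source code.

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical sketch (not the DBDT code): link a value to its closest prototype. */
final class CenterSplitSketch {

    /** Returns the index of the prototype closest to the value under the given distance. */
    static <T> int nearestPrototype(T value, T[] prototypes, ToDoubleBiFunction<T, T> distance) {
        if (prototypes.length == 0) {
            throw new IllegalArgumentException("at least one prototype is required");
        }
        int best = 0;
        double bestDistance = distance.applyAsDouble(value, prototypes[0]);
        for (int i = 1; i < prototypes.length; i++) {
            double d = distance.applyAsDouble(value, prototypes[i]);
            if (d < bestDistance) { // ties keep the earliest prototype
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed example: three numeric prototypes and an absolute-difference distance.
        Double[] prototypes = {1.0, 5.0, 9.0};
        ToDoubleBiFunction<Double, Double> absDiff = (a, b) -> Math.abs(a - b);
        System.out.println(nearestPrototype(6.2, prototypes, absDiff));
    }
}
```

Because the distance is passed in as a function, the same routine would work for any data type for which a distance can be defined, which is the point of working in a metric space.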
Newton Trees, based on the DBDT framework, are a redefinition of probability estimation trees (PETs) based on a stochastic understanding of decision trees that follows the principle of attraction (relating mass and
distance through the Inverse Square Law). The structure, application and, especially, the graphical representation of these Newton trees
make their stochastically driven predictions intelligible to users, thus preserving one of the most desirable features
of decision trees: comprehensibility.
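The principle of attraction can be sketched as follows. This is only an illustration under stated assumptions: each prototype has a "mass" (for instance, the number of training instances it covers), its attraction on an instance is mass divided by squared distance, and the attractions are normalised into probabilities. The class and method names are hypothetical, and the normalisation and the zero-distance handling are choices made for this sketch, not details confirmed by the text above.

```java
/** Hypothetical sketch (not the Newton Trees code): inverse-square attraction. */
final class NewtonAttractionSketch {

    /**
     * Computes attraction_i = mass_i / distance_i^2 for every prototype and
     * normalises the attractions so that they sum to 1.
     */
    static double[] attractionProbabilities(double[] masses, double[] distances) {
        if (masses.length != distances.length || masses.length == 0) {
            throw new IllegalArgumentException("masses and distances must be non-empty and aligned");
        }
        int n = masses.length;
        double[] probs = new double[n];

        // Assumption: an instance that coincides with one or more prototypes
        // (distance 0) is shared equally among those prototypes.
        int coincident = 0;
        for (double d : distances) {
            if (d == 0.0) coincident++;
        }
        if (coincident > 0) {
            for (int i = 0; i < n; i++) {
                probs[i] = (distances[i] == 0.0) ? 1.0 / coincident : 0.0;
            }
            return probs;
        }

        double total = 0.0;
        for (int i = 0; i < n; i++) {
            probs[i] = masses[i] / (distances[i] * distances[i]);
            total += probs[i];
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total attraction must be positive");
        }
        for (int i = 0; i < n; i++) {
            probs[i] /= total;
        }
        return probs;
    }

    public static void main(String[] args) {
        // A light but close prototype versus a heavy but distant one.
        double[] p = attractionProbabilities(new double[] {10, 40}, new double[] {1, 4});
        System.out.println(java.util.Arrays.toString(p));
    }
}
```

In this example the nearer prototype receives about 80% of the probability even though it has a quarter of the mass, because attraction falls off with the square of the distance.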
You can download the whole
system package for academic use under the following conditions:
& COPYRIGHT: The software has been checked on several Intel-based
machines (PCs) under different versions of MS Windows (2000, XP). You
can make any modification to the software, provided you always
make the changes explicit and refer to its original authors. We are
not responsible for any damage caused by the use or misuse of this software. If
you find a bug, please contact the authors. For commercial use *do* contact
Source and Executable Code:
- Windows stable version: DBDT, ProbDBDT & Newton Trees
: DBDT has been implemented in JBuilder using the WEKA
libraries. The GUI uses a non-standard Java layout, so compatibility
with other platforms is not ensured. A release JBuilder version is usually
- Select the Load Files panel to load a new data set (see figure below). Two
files are required: an *.arff file and metric_space.txt, which
stores the distance information. If the latter does not
exist, the system can compute it automatically: click on the second File
menu entry (Save Ms File from arff).
- Once the sample file is loaded, run the DBDT algorithm. Select the Experimenter panel and
click on the Run button. Remember to set up the parameters you are interested in
(Accuracy and/or AUC).
- To save the results to a file, press the Save As
button. The Clear button clears the editing area.
Note that the Weka
classes are required as well. The application links them from a default path.
Of course, the latest version of the library can be obtained from the
Many example datasets in DBDT format (*.arff file + metric_space.txt) can be found here. If you have no
examples in DBDT format, please download them, because they will be required.
What should it look like?
After loading and running the
Java project, the interface should look as follows.
- Several cross-validation options.
- Detailed performance achieved in each fold (mistakes, ratio, etc.).
- Accuracy and AUC evaluation (AUC only for two-class problems).
- The ID3 and C4.5 methods (with the pruning option disabled) can also be
invoked for non-structured problems.
- Distance-based criterion for instance classification (proximity).
- Naive density-based criterion for instance classification
- Number of children for each node.
- Enhance the comprehensibility of the model.
- Include pruning techniques.
- Develop heuristic functions based on MML/MDL criteria.
- Access to databases.
- Program a distributed version of the DBDT algorithm.
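The accuracy and AUC measures mentioned in the feature list can be illustrated with a small, self-contained sketch. It is not the evaluation code used by DBDT or WEKA: the class `EvaluationSketch` and its methods are hypothetical, and AUC is computed here in its pairwise (Mann-Whitney) form, which is only defined for two-class problems.

```java
/** Hypothetical sketch (not the DBDT/WEKA code): accuracy and two-class AUC. */
final class EvaluationSketch {

    /** Fraction of instances whose predicted class equals the actual class. */
    static double accuracy(int[] predicted, int[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and aligned");
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == actual[i]) correct++;
        }
        return (double) correct / predicted.length;
    }

    /**
     * AUC as the probability that a randomly chosen positive instance is scored
     * above a randomly chosen negative one; ties count as one half.
     */
    static double auc(double[] positiveScores, double[] negativeScores) {
        if (positiveScores.length == 0 || negativeScores.length == 0) {
            throw new IllegalArgumentException("both classes need at least one instance");
        }
        double wins = 0.0;
        for (double p : positiveScores) {
            for (double n : negativeScores) {
                if (p > n) {
                    wins += 1.0;
                } else if (p == n) {
                    wins += 0.5;
                }
            }
        }
        return wins / ((double) positiveScores.length * negativeScores.length);
    }

    public static void main(String[] args) {
        System.out.println(accuracy(new int[] {1, 0, 1, 1}, new int[] {1, 0, 0, 1}));
        System.out.println(auc(new double[] {0.9, 0.8, 0.4}, new double[] {0.7, 0.3}));
    }
}
```

The pairwise form is quadratic in the number of instances, which is fine for small folds. A rank-based computation would be preferable for large data sets.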
© 2004-2010 Vicent Estruch-Gregori, Fernando Martínez Plumed