The configuration process performed by irace will show at the end of the execution one or more configurations that are the best performing configurations found. This package provides a set of functions that allow to further assess the performance of these configurations and provides support to obtain insights about the details of the configuration process.
To use the methods provided by this package you must have an irace data object, this object is saved as an Rdata file (irace.Rdata by default) after each irace execution.
During the configuration procedure irace evaluates several candidate configurations (parameter settings) on different training insrances, creating an algorithm performance data set we call the training data set. This information is thus, the data that irace had access to when configuring the algorithm.
You can also enable the test evaluation option in irace, in which a set of elite configurations will be evaluated on a set of test instances after the execution of irace is finished. Nota that this option is not enabled by default and you must provide the test instances in order to enable it. The performance obtained in this evalaution is called the test data set. This evaluation helps assess the results of the configuration in a more “real” setup. For example, we can assess if the configuration process incurred in overtuning or if a type of instance was underrepresented in the training set. We note that irace allows to perform the test evaluations to the final elite configurations and to the elite configurations of each iterations. For information about the irace setup we refer you to the irace package user guide.
Note: Before executing irace, consider setting the test evaluation option of irace.
Once irace is executed, you can load the irace log in the R console as previously shown.
The irace plot package provides several functions that display
information about configurations. For visualizing individual
configurations the parallel_coord
shows each configuration
as a line.
The plot_configurations()
function generates a similar
parallel coordinates plot when provided with an arbitrary set of
configurations without the irace execution context. For example, to
display all elite configurations:
all_elite <- iraceResults$allConfigurations[unlist(iraceResults$allElites),]
plot_configurations(all_elite, iraceResults$scenario$parameters)
A similar display can be obtained using the parallel_cat
function. For example to visualize the configurations of a selected set
of parameters:
parallel_cat(irace_results = iraceResults,
param_names=c("algorithm", "localsearch", "dlb", "nnls"))
The sampling_pie
function creates a plot that displays
the values of all configurations sampling during the configuration
process. The size of each parameter value in the plot is dependent of
the number of configurations having that value in the
configurations.
sampling_pie(irace_results = iraceResults, param_names=c("algorithm", "localsearch", "alpha", "beta", "rho"))
Note that for some of the previous plots, numerical parameters domains are discretized to be showm in the plot. Check the documentation of the functions and the user guide to adjust this setting.
The package provides several functions to visualize values sampled during the configuration procedure and their distributions. These plots can help identifying the areas in the parameter space where irace detected a high performance.
A general overview of the sampled parameters values can be obtained
with the sampling_frequency
function which generates
frequency and density plots for the sampled values:
If you would like to visualize the distribution of a particular set
of configurations, you can pass directly a set of configurations and a
parameters object in the irace format to the
sampling_frequency
function:
sampling_frequency(iraceResults$allConfigurations, iraceResults$scenario$parameters, param_names = c("alpha"))
A detailed plot showing the sampling by iteration can be obtained
with the sampling_frequency_iteration
function. This plot
shows the convergence of the configuration process reflected in the
sampled parameter values.
To visualize the joint sampling frequency of two parameters you can
use the sampling_heatmap
function.
The configurations can be provided directly to the
sampling_heatmap2
function. In both functions, the
parameter sizes can be used to adjust the number of intervals to be
displayed:
sampling_heatmap2(iraceResults$allConfigurations, iraceResults$scenario$parameters,
param_names = c("localsearch","nnls"), sizes=c(0,5))
For more details of these functions, check the documentation of the functions and the user guide.
You may like to have a general overview of the distance of the
configurations sampled across the configuration process. This can allow
you to assess the convergence of the configuration process. Use the
sampling_distance
function to display the mean distance of
the configurations across the iterations of the configuration
process:
Numerical parameter distance can be adjusted with a treshold
(t=0.05
), check the documentation of the function and the
user guide
for details.
The test performance of the best final configurations can be
visualized using the boxplot_test
function.
Note that the irace execution log includes test data (test is not enabled by default), check the irace package user guide for details on how to use the test feature in irace.
To investigate the difference in the performance of two
configurations the scatter_test
function displays the
performance of both configurations paired by instance (each point
represents an instance):
Visualizing training performance might help to obtain insights about the reasoning that followed irace when searching the parameter space, and thus it can be used to understand why irace considers certain configurations as high or low performing.
To visualize the performance of the final elites observed by irace,
the boxplot_training
function plots the experiments
performed on these configurations. Note that this data corresponds to
the performance generated during the configuration process thus, the
number of instances on which the configurations were evaluated might
vary between elite configurations.
To observe the difference in the performance of two configurations
you can also generate a scatter plot using the
scatter_training
function:
To plot the performance of a selected set of configurations in an
experiment matrix, you can use the boxplot_performance
function. The configurations can be selected in a vector or a list
(allElites):
boxplot_performance(iraceResults$experiments, allElites=list(c(803,808), c(809,800)), first_is_best = TRUE)
In the same way, you can use the scatter_perfomance
function to plot the difference between configurations:
Note that there these functions can be adjusted to display differently the configurations (i.e. include or not instancs). Check the package user guide and the documentation of each function for details.
In some cases, it might be interesting have a general visualization
for the configuration process. This can be obtained with the
plot_experiments_matrix
function:
The sampling distributions used by irace during the configuration
process can be displayed using the plot_model
function. For
categorical parameters, this function displays the sampling
probabilities associated to each parameter value by iteration (x axis
top) in each elite configuration model (bars):
For numerical parameters, this function shows the sampling distributions associated to each parameter. These plots display the the density function of the truncated normal distribution associated to the models of each elite configuration in each instance: