Reference Manual: DSL_bs

This class implements a Bayesian search learning procedure that uses random restarts.

Methods

  • DSL_bs()

Constructor for the class DSL_bs. The constructor creates the object with the default values of the parameters. The parameters, which are public members of the object, are:

int maxParents Used to limit the maximum number of parents a node can have. Setting this too high may cause memory problems.

int nrIteration Sets the number of searches (and indirectly, the number of random restarts) the algorithm performs.

double priorLinkProbability Defines a prior for the existence of an arc between two nodes. This prior probability is used for all possible arcs. Setting a low probability creates a (prior) bias towards sparser networks; setting a high probability creates a bias towards denser networks.

int priorSampleSize Influences the "strength" of priorLinkProbability. Bigger values of priorSampleSize increase the influence of the prior. This is more noticeable with smaller data sets.

int maxSearchTime Sets the maximum runtime of the algorithm.

unsigned int seed Random seed used for generating random starting points (networks) for Bayesian search.

double linkProbability Used for the random network generator that generates the starting networks. It defines the probability of having an arc between two nodes. Setting a low probability will lead to sparser (starting) networks, setting it higher will lead to denser (starting) networks.

The structure DSL_bkgndKnowledge bkk has the following fields, which are used to describe background knowledge:

IntPairVector forcedArcs, vector of pairs of indices (parent, child) denoting arcs that will be present in the graphical structure. Variable indices correspond to the indices of the DSL_dataset.

IntPairVector forbiddenArcs, vector of pairs of indices (parent, child) denoting arcs that will be absent in the graphical structure. Variable indices correspond to the indices of the DSL_dataset.

IntPairVector tiers, vector of pairs, where the integers in each pair mean (variable, tier number). Tier numbers start from 1. Variable indices correspond to the indices of the DSL_dataset.
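
The following is a minimal sketch of how such background knowledge might be assembled and sanity-checked before learning. It assumes that IntPairVector behaves like a std::vector of std::pair<int, int>. The BackgroundKnowledge struct and the isConsistent and makeExample helpers are hypothetical stand-ins, not part of the library, so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: stand-in for the library's IntPairVector type.
typedef std::vector<std::pair<int, int> > IntPairVector;

// Hypothetical mirror of the documented DSL_bkgndKnowledge fields.
struct BackgroundKnowledge {
    IntPairVector forcedArcs;     // (parent, child) arcs that must be present
    IntPairVector forbiddenArcs;  // (parent, child) arcs that must be absent
    IntPairVector tiers;          // (variable, tier number), tiers start at 1
};

// Hypothetical helper: true when no arc is both forced and forbidden
// and every tier number is at least 1.
bool isConsistent(const BackgroundKnowledge &bkk) {
    for (std::size_t i = 0; i < bkk.forcedArcs.size(); ++i) {
        if (std::find(bkk.forbiddenArcs.begin(), bkk.forbiddenArcs.end(),
                      bkk.forcedArcs[i]) != bkk.forbiddenArcs.end()) {
            return false;
        }
    }
    for (std::size_t i = 0; i < bkk.tiers.size(); ++i) {
        if (bkk.tiers[i].second < 1) {
            return false;
        }
    }
    return true;
}

// Example: variables are identified by their index in the data set.
BackgroundKnowledge makeExample() {
    BackgroundKnowledge bkk;
    bkk.forcedArcs.push_back(std::make_pair(0, 2));     // force arc 0 -> 2
    bkk.forbiddenArcs.push_back(std::make_pair(2, 1));  // forbid arc 2 -> 1
    bkk.tiers.push_back(std::make_pair(0, 1));          // variable 0 in tier 1
    bkk.tiers.push_back(std::make_pair(2, 2));          // variable 2 in tier 2
    assert(isConsistent(bkk));
    return bkk;
}
```

With the real library, the same pairs would presumably be pushed into the bkk member of the DSL_bs object rather than into a separate struct.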


DSL_bs() sets the parameters to the following default values:

• maxParents = 5

• maxSearchTime = 0

• nrIteration = 20

• linkProbability = 0.1

• priorLinkProbability = 0.001

• priorSampleSize = 50

• seed = 0
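
As an illustration of these defaults, the sketch below mirrors the documented parameters in a plain struct and overrides a few of them. BsParams, looksValid, and sparseConfig are hypothetical names introduced here so the snippet compiles without the library. The range checks in looksValid are assumptions about sensible values, not rules stated by the library.

```cpp
#include <cassert>

// Hypothetical mirror of the documented public parameters of DSL_bs,
// initialized to the default values listed above.
struct BsParams {
    int maxParents;
    int maxSearchTime;
    int nrIteration;
    double linkProbability;
    double priorLinkProbability;
    int priorSampleSize;
    unsigned int seed;

    BsParams()
        : maxParents(5), maxSearchTime(0), nrIteration(20),
          linkProbability(0.1), priorLinkProbability(0.001),
          priorSampleSize(50), seed(0) {}
};

// Assumed sanity checks: probabilities lie in [0, 1], counts are not negative.
bool looksValid(const BsParams &p) {
    return p.maxParents >= 0 && p.maxSearchTime >= 0 && p.nrIteration > 0
        && p.linkProbability >= 0.0 && p.linkProbability <= 1.0
        && p.priorLinkProbability >= 0.0 && p.priorLinkProbability <= 1.0
        && p.priorSampleSize > 0;
}

// Start from the defaults and bias the search towards sparser networks.
BsParams sparseConfig() {
    BsParams p;
    p.maxParents = 3;                 // fewer parents per node
    p.priorLinkProbability = 0.0005;  // lower prior probability for each arc
    p.seed = 42;                      // fixed seed for the random restarts
    assert(looksValid(p));
    return p;
}
```

On a real DSL_bs object the same assignments would be made directly on its public members after construction.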


  • int Learn(DSL_dataset &data, DSL_network &net);

This method performs the actual learning procedure. The first argument is the input data set. The result of the learning procedure is stored in the DSL_network, which is the second argument. The method returns DSL_OKAY if learning was successful and an error code otherwise.
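
A typical call pattern might look like the sketch below. The types DSL_dataset, DSL_network, and DSL_bs and the constant DSL_OKAY are declared here only as stand-ins so that the snippet compiles without the library. The stub Learn does no learning, and the value 0 for DSL_OKAY is an assumption. The learnStructure wrapper is a hypothetical helper. With the real library, its header would be included in place of the stand-ins.

```cpp
#include <cassert>
#include <cstdio>

// --- Stand-ins (assumptions) so this sketch compiles without the library ---
const int DSL_OKAY = 0;  // assumed value of the success code
struct DSL_dataset {};
struct DSL_network {};
struct DSL_bs {
    int maxParents;
    int nrIteration;
    DSL_bs() : maxParents(5), nrIteration(20) {}
    // Stub only: the real method runs the Bayesian search.
    int Learn(DSL_dataset &, DSL_network &) { return DSL_OKAY; }
};
// --- End of stand-ins ---

// Hypothetical wrapper: configure the learner, run it, and check the result.
bool learnStructure(DSL_dataset &data, DSL_network &net) {
    DSL_bs bs;            // constructed with the default parameter values
    bs.maxParents = 4;    // override selected public members
    bs.nrIteration = 50;

    int res = bs.Learn(data, net);
    if (res != DSL_OKAY) {
        std::fprintf(stderr, "Learning failed, error code %d\n", res);
        return false;
    }
    return true;  // net now holds the learned structure
}
```

Comparing the return value against DSL_OKAY, rather than testing for a specific error code, follows the description of Learn above.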