Reference Manual: DSL_bs
This class implements a Bayesian search learning procedure that uses random restarts.
Methods
- DSL_bs()
Constructor for the class DSL_bs. The constructor creates the object with the default values of the parameters. The parameters, which are public members of the object, are:
• int maxParents Used to limit the maximum number of parents a node can have. Setting this too high may cause memory problems.
• int nrIteration Sets the number of searches (and indirectly, the number of random restarts) the algorithm performs.
• double priorLinkProbability Defines a prior for the existence of an arc between two nodes. This prior probability is used for all possible arcs. A low probability creates a (prior) bias towards sparser networks; a high probability creates a bias towards denser networks.
• int priorSampleSize Influences the "strength" of priorLinkProbability. Bigger values of priorSampleSize increase the influence of the prior. This is more noticeable with smaller data sets.
• int maxSearchTime Sets the maximum runtime of the algorithm.
• unsigned int seed Random seed used for generating random starting points (networks) for Bayesian search.
• double linkProbability Used by the random network generator that produces the starting networks. It defines the probability of having an arc between two nodes. A low probability leads to sparser starting networks; a higher probability leads to denser starting networks.
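The effect of linkProbability on the density of a starting network can be illustrated with a small sketch. The function randomStartingArcs below is hypothetical and is not the generator used by DSL_bs; it only assumes that each candidate arc is kept independently with the given probability, and it ignores maxParents.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the library: draws a random DAG over
// nodeCount nodes. Every pair of nodes is considered once, oriented along a
// random node ordering, and kept as an arc with probability linkProb.
std::vector<std::pair<int, int>> randomStartingArcs(int nodeCount,
                                                    double linkProb,
                                                    unsigned int seed) {
    assert(nodeCount >= 0 && linkProb >= 0.0 && linkProb <= 1.0);
    std::mt19937 rng(seed);

    // Orienting all arcs along one random ordering guarantees acyclicity.
    std::vector<int> order(nodeCount);
    std::iota(order.begin(), order.end(), 0);
    std::shuffle(order.begin(), order.end(), rng);

    std::bernoulli_distribution keepArc(linkProb);
    std::vector<std::pair<int, int>> arcs;  // (parent, child)
    for (int i = 0; i < nodeCount; ++i) {
        for (int j = i + 1; j < nodeCount; ++j) {
            if (keepArc(rng)) {
                arcs.emplace_back(order[i], order[j]);
            }
        }
    }
    return arcs;
}
```

Under this assumption the expected number of arcs is linkProb * n * (n - 1) / 2, so with linkProbability = 0.1 and 10 variables a starting network would contain 4.5 arcs on average.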
The structure DSL_bkgndKnowledge bkk describes background knowledge through the following fields:
• IntPairVector forcedArcs, a vector of index pairs (parent, child) denoting arcs that will be present in the graphical structure. Variable indices correspond to the indices of the DSL_dataset.
• IntPairVector forbiddenArcs, a vector of index pairs (parent, child) denoting arcs that will be absent from the graphical structure. Variable indices correspond to the indices of the DSL_dataset.
• IntPairVector tiers, a vector of pairs (variable, tier number). Tier numbers start from 1. Variable indices correspond to the indices of the DSL_dataset.
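As a rough illustration of how these fields might be filled in, the sketch below uses a stand-in structure so that it compiles without the library headers. BkgndKnowledgeSketch, isConsistent and exampleKnowledge are hypothetical names, and the element type chosen for IntPairVector (a vector of std::pair<int, int>) is an assumption; with the real library, the same values would go into the bkk member of a DSL_bs object.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Assumption: IntPairVector behaves like a vector of (int, int) pairs.
using IntPairVector = std::vector<std::pair<int, int>>;

// Stand-in mirroring the three documented fields of DSL_bkgndKnowledge.
struct BkgndKnowledgeSketch {
    IntPairVector forcedArcs;     // (parent, child)
    IntPairVector forbiddenArcs;  // (parent, child)
    IntPairVector tiers;          // (variable, tier number), tiers start at 1
};

// Hypothetical sanity check: no arc may be both forced and forbidden,
// and every tier number must be at least 1.
bool isConsistent(const BkgndKnowledgeSketch &bkk) {
    for (const auto &arc : bkk.forcedArcs) {
        if (std::find(bkk.forbiddenArcs.begin(), bkk.forbiddenArcs.end(),
                      arc) != bkk.forbiddenArcs.end()) {
            return false;
        }
    }
    for (const auto &entry : bkk.tiers) {
        if (entry.second < 1) {
            return false;
        }
    }
    return true;
}

// Builds a small example over data set variables 0, 1 and 2.
BkgndKnowledgeSketch exampleKnowledge() {
    BkgndKnowledgeSketch bkk;
    bkk.forcedArcs.push_back({0, 2});     // arc 0 -> 2 must be present
    bkk.forbiddenArcs.push_back({2, 1});  // arc 2 -> 1 must be absent
    bkk.tiers.push_back({0, 1});          // variable 0 belongs to tier 1
    bkk.tiers.push_back({2, 2});          // variable 2 belongs to tier 2
    return bkk;
}
```

Checking the constraints before learning is a design choice of this sketch, not documented library behavior; it simply catches contradictory input early.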
DSL_bs() sets the parameters to the following default values:
• maxParents = 5
• maxSearchTime = 0
• nrIteration = 20
• linkProbability = 0.1
• priorLinkProbability = 0.001
• priorSampleSize = 50
• seed = 0
- int Learn(DSL_dataset &data, DSL_network &net);
This method performs the actual learning procedure. The first argument is the input data set. The result of the learning procedure is stored in the DSL_network passed as the second argument. The method returns DSL_OKAY if learning was successful and an error code otherwise.
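A typical call sequence might look like the sketch below. To keep it compilable on its own, the library types are replaced by minimal stand-ins inside the namespace standin: the stub Learn only reports success, the value given to DSL_OKAY is an assumption, and learnStructure is a hypothetical wrapper. When building against the real library, the namespace would be dropped in favor of the library headers.

```cpp
#include <cassert>

namespace standin {
// Stand-ins for the library declarations, present only so that the sketch
// compiles without the library. They mirror what this section documents.
const int DSL_OKAY = 0;  // assumption: the real constant comes from the library
struct DSL_dataset {};
struct DSL_network {};
struct DSL_bs {
    // Documented default parameter values.
    int maxParents = 5;
    int maxSearchTime = 0;
    int nrIteration = 20;
    double linkProbability = 0.1;
    double priorLinkProbability = 0.001;
    int priorSampleSize = 50;
    unsigned int seed = 0;
    // Stub: the real method runs the search; this one only reports success.
    int Learn(DSL_dataset &, DSL_network &) { return DSL_OKAY; }
};
}  // namespace standin

// Hypothetical wrapper: adjusts a few parameters, runs the learner and
// converts the returned status code into a boolean.
bool learnStructure(standin::DSL_dataset &data, standin::DSL_network &net) {
    standin::DSL_bs bs;    // starts from the default parameter values
    bs.maxParents = 3;     // allow at most three parents per node
    bs.nrIteration = 50;   // perform more searches than the default 20
    bs.seed = 12345;       // fix the seed for the random starting networks
    int status = bs.Learn(data, net);
    return status == standin::DSL_OKAY;
}
```

Because the parameters are public members, they are set directly on the object before Learn is called; the return value should always be compared against DSL_OKAY before the resulting network is used.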