Reference Manual: DSL dataset


The class DSL_dataset is a basic container for data. Objects of this class are used to pass data between data sources and the preprocessing and learning classes.


Methods

  • DSL_dataset()

This default constructor creates an empty data set with no variables. To add variables, use the Reshape method or one of the AddIntVar and AddFloatVar methods.


  • DSL_dataset(const std::vector<DSL_variableInfo> &vars)

This constructor creates a data set with variables defined by the first argument. The order of variables is preserved and no records are created.


  • bool Reshape(const std::vector<DSL_variableInfo> &vars)

This method defines the variables of the data set. Any variables or records that existed before this method was called are destroyed. If an error occurs, the method returns false.


  • bool AddRecord(const std::vector<DSL_dataElement> &here)

This method adds a record to the data set. The variables must be defined before this method is called, and the size of the record must equal the number of variables. If any error occurs, the method returns false.
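The interplay between Reshape and AddRecord can be illustrated with a small stand-in. This is a minimal sketch and not the DSL implementation: the type MiniDataset, its use of plain int elements and string ids in place of DSL_dataElement and DSL_variableInfo, and its one-vector-per-variable storage are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration only; this is NOT the DSL_dataset class.
// Elements are plain ints and data is stored as one vector per variable.
struct MiniDataset {
    std::vector<std::string> ids;            // one id per variable
    std::vector<std::vector<int>> columns;   // columns[v][i]: variable v, record i

    // Defines the variables; existing variables and records are discarded.
    bool Reshape(const std::vector<std::string> &newIds) {
        ids = newIds;
        columns.assign(newIds.size(), std::vector<int>());
        return true;
    }

    // Appends one record; fails if no variables are defined or the sizes differ.
    bool AddRecord(const std::vector<int> &record) {
        if (columns.empty() || record.size() != columns.size()) return false;
        for (std::size_t v = 0; v < columns.size(); ++v) {
            columns[v].push_back(record[v]);
        }
        return true;
    }

    int NumVariables() const { return static_cast<int>(columns.size()); }

    int NumRecords() const {
        return columns.empty() ? 0 : static_cast<int>(columns[0].size());
    }
};
```

In this sketch, calling AddRecord before Reshape fails, a record with two elements is accepted once two variables are defined, and a record with three elements is rejected.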


  • bool AddIntVar(const std::string id = std::string(), const std::vector<int> *data = NULL, int missingValue = DSL_MISSING_INT)
  • bool AddFloatVar(const std::string id = std::string(), const std::vector<float> *data = NULL, float missingValue = DSL_MISSING_FLOAT)

These two methods add a variable to the data set. The variable is always added at the end of the variable list. If the number of elements in the vector passed as data differs from the number of records in the data set, an error occurs and the method returns false.


  • int NumVariables() const

This method returns the number of variables (columns) in the data set.


  • int NumRecords() const

This method returns the number of records (rows) in the data set.


  • DSL_dataElement& At(int v, int i)
  • DSL_dataElement At(int v, int i) const

Two access methods for a single element. The first argument is the index of a variable, the second is the index of a record. Indices start from 0.
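The argument order (variable first, record second) is easy to get backwards, so a short sketch may help. The type MiniTable and the helper SumRecord below are hypothetical and use plain int elements; they only mirror the shape of the two At overloads and are not part of DSL.

```cpp
#include <vector>

// Hypothetical stand-in for illustration only; this is NOT the DSL_dataset class.
struct MiniTable {
    std::vector<std::vector<int>> columns;   // columns[v][i]: variable v, record i

    // Non-const overload: returns a reference, so the element can be modified.
    int &At(int v, int i) { return columns[v][i]; }

    // Const overload: returns a copy of the element.
    int At(int v, int i) const { return columns[v][i]; }
};

// Sums one record (row) by visiting every variable at a fixed record index.
int SumRecord(const MiniTable &t, int i) {
    int sum = 0;
    for (int v = 0; v < static_cast<int>(t.columns.size()); ++v) {
        sum += t.At(v, i);   // the const overload is selected here
    }
    return sum;
}
```

Writing through the non-const overload, as in `t.At(1, 2) = 99;`, changes the third record of the second variable in this sketch.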


  • bool GetRecord(int i, std::vector<DSL_dataElement> &here) const

An access method that reads a whole record (row) from the data set. The first argument is the index of a record (starting from 0) and the second argument is a user-supplied container that will be filled with the elements of the record (its old content will be destroyed). The method returns false if an error occurs.
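Reading a record amounts to collecting one element from every variable. The following is a minimal sketch of that idea, assuming one-vector-per-variable storage and plain int elements; the function name SketchGetRecord is hypothetical, and clearing the output before validating the index is a choice made for this sketch rather than documented DSL behavior.

```cpp
#include <vector>

// Hypothetical helper for illustration only; not part of DSL.
// columns[v][i] holds the element of variable v in record i.
bool SketchGetRecord(const std::vector<std::vector<int>> &columns, int i,
                     std::vector<int> &here) {
    here.clear();   // old content of the user-supplied container is discarded
    if (columns.empty() || i < 0 || i >= static_cast<int>(columns[0].size())) {
        return false;   // no variables, or record index out of range
    }
    here.reserve(columns.size());
    for (const std::vector<int> &column : columns) {
        here.push_back(column[i]);   // one element per variable, in variable order
    }
    return true;
}
```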


  • bool IsMissing(int v, int i) const

This method tests whether an element in the data set is missing. The first argument is the index of a variable, the second is the index of a record. Indices start from 0. The method returns true if the element is missing, otherwise false.
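One plausible reading, given that each variable carries its own marker for the missing value, is that an element counts as missing when it equals that marker. The sketch below illustrates this reading only; MiniColumn, SketchIsMissing, and CountMissing are hypothetical names, elements are plain ints, and the comparison rule is an assumption rather than documented DSL behavior.

```cpp
#include <vector>

// Hypothetical stand-in for illustration only; not part of DSL.
struct MiniColumn {
    std::vector<int> data;   // elements of one variable, one per record
    int missingValue;        // per-variable marker for a missing element
};

// Assumed rule: an element is missing when it equals its variable's marker.
bool SketchIsMissing(const std::vector<MiniColumn> &cols, int v, int i) {
    return cols[v].data[i] == cols[v].missingValue;
}

// Counts the missing elements of variable v.
int CountMissing(const std::vector<MiniColumn> &cols, int v) {
    int count = 0;
    for (int i = 0; i < static_cast<int>(cols[v].data.size()); ++i) {
        if (SketchIsMissing(cols, v, i)) ++count;
    }
    return count;
}
```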


  • bool IsDiscrete(int v) const

This method tests whether a variable is discrete. The argument is the index of a variable (starting from 0). The method returns true if the variable is discrete, otherwise false.


  • int FindVariable(const std::string &id) const

This method returns the index of the first variable with the given id (ids of variables in a DSL_dataset are not guaranteed to be unique). If there is no variable with that id in the data set, the method returns -1.
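The first-match rule matters when ids repeat. A minimal sketch of the lookup, using a plain vector of ids in place of the data set (the function name SketchFindVariable is hypothetical and not part of DSL):

```cpp
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not part of DSL.
// Returns the index of the first id equal to the one requested, or -1 if none matches.
int SketchFindVariable(const std::vector<std::string> &ids, const std::string &id) {
    for (int v = 0; v < static_cast<int>(ids.size()); ++v) {
        if (ids[v] == id) return v;   // stop at the first match
    }
    return -1;
}
```

With the ids "age", "sex", "age", a search for "age" yields 0 in this sketch, and the second variable named "age" can only be reached by its index.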


  • void CleanUp()

This method resets the data set. All information stored in the data set is lost, and its state becomes equivalent to that of a data set created with the default constructor DSL_dataset().


  • bool GetVariableInfo(int v, DSL_variableInfo &here) const

This method retrieves information about a variable. The first argument is the index of a variable (starting from 0) and the second argument is a reference to a DSL_variableInfo structure provided by the user. If an error occurs, the method returns false.


  • void GetVariablesInfo(std::vector<DSL_variableInfo> &here) const

This method retrieves information about all variables. The argument is a reference to a vector of DSL_variableInfo structures provided by the user.


  • const std::vector<DSL_dataElement> &GetVariableData(int v) const

This method gives direct access to the vector of data elements for a variable. The argument is the index of a variable (starting from 0).


  • bool SetVariableData(int v, const std::vector<DSL_dataElement> &newData)

This method replaces the existing data for the variable indexed by v with the data provided in newData. The method returns false if the index v is incorrect or if the number of elements in newData differs from the number of records in the data set.
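The two failure conditions can be sketched as follows. This is an illustration under stated assumptions, not the DSL implementation: the function name SketchSetVariableData is hypothetical, elements are plain ints, and data is stored as one vector per variable with equal lengths.

```cpp
#include <vector>

// Hypothetical helper for illustration only; not part of DSL.
// columns[v][i] holds the element of variable v in record i.
bool SketchSetVariableData(std::vector<std::vector<int>> &columns, int v,
                           const std::vector<int> &newData) {
    if (v < 0 || v >= static_cast<int>(columns.size())) {
        return false;   // incorrect variable index
    }
    if (newData.size() != columns[v].size()) {
        return false;   // element count differs from the number of records
    }
    columns[v] = newData;   // replace the whole column; other variables are untouched
    return true;
}
```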


  • std::string GetId(int v) const
  • bool SetId(int v, std::string newId)

These methods get or set the id of the variable indexed by v.


  • int GetHandle(int v) const
  • bool SetHandle(int v, int handle)

These methods get or set the handle of the variable indexed by v.


  • std::vector<std::string> GetStateNames(int v) const
  • bool SetStateNames(int v, std::vector<std::string> &newNames)

These methods get or set the state names of the variable indexed by v.


  • DSL_dataElement GetMissingValue(int v) const
  • bool SetMissingValue(int v, DSL_dataElement &newMissing)

These methods get or set the marker for the missing value of the variable indexed by v.
