Reference Manual: DSL_textParser

Revision as of 16:31, 25 November 2010 by WikiSysop


This utility class implements a simple parser for text files. The current implementation is not necessarily efficient, so for large text files we encourage users to write a specialized implementation.


This method has been removed from the API; see [SMILE/SMILearn (C++): change notes]. Use the following instead:

  DSL_dataset dataset;
  std::string filename;  // name of the text file that contains the data
  if (!dataset.ReadFile(filename))
    _error_message();  // report the failure here


  • DSL_textParserParams

DSL_textParserParams is a structure that stores the settings for DSL_textParser. It has the following fields, shown with their default values:

bool useHeader = true, specifies whether the first row in the text file contains the labels of the variables.

bool createPreprocessFile = false, when set to true, the parser produces a text file with the parsed information. The file has the same name as the original data file, with the extension .glw.

std::string textFileName = "", the name of the text file that contains the data.

std::string missingValue = "*", the marker in the data that corresponds to a missing element.

bool typesSpecified = false, indicates whether the header of the data file specifies the variable types. There are three possible keywords: ordinal, discrete, and continuous. By default, the parser assumes the discrete type (values are treated as labels).
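
The field list above can be summarized as a plain C++ structure. The sketch below is a hypothetical mirror written for illustration, not the library's actual declaration: the names ParserParamsSketch, ColumnType, and columnTypeFromKeyword are assumptions, and only the field names, the default values, and the three type keywords are taken from the description above.

```cpp
#include <string>

// Hypothetical mirror of the documented fields; not the library's declaration.
struct ParserParamsSketch {
    bool useHeader = true;              // first row holds the variable labels
    bool createPreprocessFile = false;  // write a .glw file with parsed information
    std::string textFileName = "";      // name of the text file with the data
    std::string missingValue = "*";     // marker for a missing element
    bool typesSpecified = false;        // header also declares the data types
};

// The three type keywords named in the description.
enum class ColumnType { Ordinal, Discrete, Continuous };

// Maps a header keyword to a column type. Falling back to Discrete for an
// unrecognized keyword is an assumption modeled on the documented default.
inline ColumnType columnTypeFromKeyword(const std::string &keyword) {
    if (keyword == "ordinal") return ColumnType::Ordinal;
    if (keyword == "continuous") return ColumnType::Continuous;
    return ColumnType::Discrete;
}
```

A default-constructed ParserParamsSketch corresponds to the settings that the parameterless parser constructor is described as using; a caller who needs a different missing-value marker or a headerless file would change only the relevant fields.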

  • DSL_textParser()

DSL_textParser(const DSL_textParserParams &params)

These are the constructors for the file parser. The first constructor uses the default values of DSL_textParserParams, while the second lets the user specify the values explicitly.

  • int Parse()

This method performs the actual file parsing and creates the internal data structures. The file is parsed again every time the method is called. It returns DSL_OKAY if the operation was successful and an error code otherwise.

  • void GetDataset(DSL_dataset &here)

This method retrieves the data set from the parsed file. Parse() should be called before this method.
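
The call order described above (construct, parse, check the return code, then fetch the data set) can be illustrated with a small stand-in. This is a minimal sketch under stated assumptions, not the library's implementation: ParserSketch, DatasetSketch, SKETCH_OKAY, and SKETCH_ERROR are hypothetical names, the input is an in-memory string rather than a file so that the example is self-contained, and whitespace-separated fields are assumed because the page does not state the delimiter.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Stand-in return codes; the concrete values are assumptions of this sketch.
constexpr int SKETCH_OKAY = 0;
constexpr int SKETCH_ERROR = -1;

// Hypothetical stand-in for the parsed data: labels plus rows of raw tokens.
struct DatasetSketch {
    std::vector<std::string> labels;
    std::vector<std::vector<std::string>> rows;
};

// Hypothetical stand-in that mimics the documented call order only.
class ParserSketch {
public:
    explicit ParserSketch(const std::string &text, bool useHeader = true)
        : text_(text), useHeader_(useHeader) {}

    // Re-parses the whole input on every call and rebuilds the internal data.
    int Parse() {
        data_ = DatasetSketch();
        parsed_ = false;
        std::istringstream in(text_);
        std::string line;
        bool first = true;
        while (std::getline(in, line)) {
            std::istringstream fields(line);
            std::vector<std::string> tokens;
            std::string token;
            while (fields >> token) tokens.push_back(token);
            if (tokens.empty()) continue;  // skip blank lines
            if (first && useHeader_) data_.labels = tokens;
            else data_.rows.push_back(tokens);
            first = false;
        }
        if (data_.rows.empty()) return SKETCH_ERROR;  // no data rows found
        parsed_ = true;
        return SKETCH_OKAY;
    }

    // Copies the parsed data into `here`; leaves it untouched if no
    // successful Parse() call has happened yet.
    void GetDataset(DatasetSketch &here) const {
        if (parsed_) here = data_;
    }

private:
    std::string text_;
    bool useHeader_;
    bool parsed_ = false;
    DatasetSketch data_;
};
```

A caller would construct the parser, confirm that Parse() returned the success code, and only then call GetDataset(). Because every Parse() call repeats the work, the result is best fetched once and reused rather than re-parsed.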
