Command-line arguments let the user customize how ML-Flex executes. Two of these arguments are mandatory; the remaining arguments are optional. Below are a few examples of how these arguments can be specified when executing ML-Flex. (Please note that the examples also pass an -Xmx argument to Java. This argument sets the maximum heap size, which lets the user increase the amount of memory available to Java and can be crucial when processing large data sets.)
Below is a list of all command-line arguments, along with examples of how to use them.
Name | Description |
EXPERIMENT_FILE | This setting requires the user to specify the path to an experiment file. The name of this file is used as the experiment name, and the file should contain the settings for an ML-Flex experiment (see Creating an Experiment File). |
ACTION | This setting requires the user to specify one of the actions that can be performed when ML-Flex executes. |
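As a rough illustration of how the two mandatory arguments might be handled, the sketch below parses command-line tokens and rejects input that lacks EXPERIMENT_FILE or ACTION. It assumes a KEY=VALUE token format, which this page does not spell out. The class ArgumentSketch, its parse method, and the example values are hypothetical and are not part of ML-Flex.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not ML-Flex source code.
// Assumption: each argument is passed as a single KEY=VALUE token.
class ArgumentSketch {
    static final String[] MANDATORY = {"EXPERIMENT_FILE", "ACTION"};

    static Map<String, String> parse(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String arg : args) {
            int split = arg.indexOf('=');
            if (split <= 0) {
                throw new IllegalArgumentException("Expected KEY=VALUE but got: " + arg);
            }
            settings.put(arg.substring(0, split), arg.substring(split + 1));
        }
        // Fail early if either mandatory argument is missing.
        for (String name : MANDATORY) {
            if (!settings.containsKey(name)) {
                throw new IllegalArgumentException("Missing mandatory argument: " + name);
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        // Placeholder values only; real action names are listed in the ML-Flex documentation.
        Map<String, String> settings = parse(new String[] {
            "EXPERIMENT_FILE=Experiments/Example.txt", "ACTION=SomeAction"});
        System.out.println(settings);
    }
}
```

Checking for the mandatory arguments before any work starts keeps the failure close to its cause. Optional arguments would simply be absent from the map and fall back to their defaults.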
Name | Description | Default |
DEBUG | As ML-Flex executes, it outputs logging information to standard out and to a Log.txt file. Setting DEBUG to true outputs additional logging information, which can aid troubleshooting when an error has occurred. By default, debugging is turned off to avoid computational and storage overhead. | false |
NUM_THREADS | ML-Flex uses the Java threading capability to execute computing tasks in parallel. With this setting, the user can specify the maximum number of threads per computing node that can be used by ML-Flex. If ML-Flex seems to be running slowly on large data sets, it may be that this value is too high. | The number of processors available to the Java virtual machine on the computer on which ML-Flex is executed. |
THREAD_TIMEOUT_MINUTES | ML-Flex uses the Java threading capability to execute computing tasks in parallel. For a variety of reasons, a thread may "hang" and not return a result. Thus it may be desirable to specify a timeout period after which ML-Flex will abandon a thread and retry executing the task. It is recommended that this setting be longer than the longest time that any given feature selection or classification task is expected to take. | 60 |
PAUSE_SECONDS | When ML-Flex executes tasks across multiple computing nodes, a thread may find that a task remains to be performed but that another thread appears to be executing it already. In most cases the task really is being executed, so the current thread pauses for a short time and then checks whether the other thread has finished. If it has, the current thread moves on to the next set of tasks. Otherwise, the current thread pauses again, and this cycle repeats until the thread timeout has elapsed, after which the corresponding lock file is deleted and the current thread attempts to execute the task itself. PAUSE_SECONDS specifies the number of seconds that each pause lasts. | 60 |
EXPORT_DATA | This setting accepts either "true" or "false" as a value. When set to true, the data that have been processed by ML-Flex will be exported to multiple formats (currently, tab-delimited and ARFF) to enable the user to perform downstream analyses if desired. | false |
MAIN_DIRECTORY | The ML-Flex executable files can be stored in one location and the data files in another. This setting specifies where the data files are stored. By default, the data files are stored in the same location as the executable files. This setting will likely be used rarely. | Same location as the executable files. |
LEARNER_TEMPLATES_FILE | This file stores information about how ML-Flex interfaces with third-party machine-learning packages. By default, this file is located at Config/Learner_Templates.txt. However, an alternative file can be specified using this parameter. | Config/Learner_Templates.txt |
CLASSIFICATION_ALGORITHMS_FILE | By default, classification algorithms are configured in the Config/Classification_Algorithms.txt file. However, an alternative file can be specified using this parameter. | Config/Classification_Algorithms.txt |
FEATURE_SELECTION_ALGORITHMS_FILE | By default, feature-selection algorithms are configured in the Config/Feature_Selection_Algorithms.txt file. However, an alternative file can be specified using this parameter. | Config/Feature_Selection_Algorithms.txt |
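The behavior described for NUM_THREADS, THREAD_TIMEOUT_MINUTES, and PAUSE_SECONDS can be sketched roughly as follows. This is not ML-Flex source code: the class and method names are hypothetical, the maxAttempts limit is an added assumption, and the sketch assumes that a lock file disappearing signals that the other worker has finished, which this page does not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the threading settings; not ML-Flex source code.
class ThreadSettingsSketch {
    // NUM_THREADS defaults to the processors available to the Java virtual machine.
    static int numThreads(Map<String, String> settings) {
        String value = settings.get("NUM_THREADS");
        return value == null ? Runtime.getRuntime().availableProcessors()
                             : Integer.parseInt(value);
    }

    // Abandons a task that exceeds the timeout and retries it.
    // Assumption: retries are capped at maxAttempts.
    static <T> T runWithTimeout(ExecutorService pool, Callable<T> task,
                                long timeout, TimeUnit unit, int maxAttempts)
            throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Future<T> future = pool.submit(task);
            try {
                return future.get(timeout, unit);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the hung thread, then retry
            }
        }
        throw new TimeoutException("Task did not finish in " + maxAttempts + " attempts");
    }

    // Pauses repeatedly while another worker appears to hold the lock file.
    // Returns true if the lock disappeared (assumed to mean the task finished
    // elsewhere). Returns false once the timeout has elapsed; the stale lock
    // is deleted first so that the caller can execute the task itself.
    static boolean waitForOtherWorker(Path lockFile, long pauseMillis, long timeoutMillis)
            throws IOException, InterruptedException {
        long waited = 0;
        while (Files.exists(lockFile)) {
            if (waited >= timeoutMillis) {
                Files.deleteIfExists(lockFile);
                return false;
            }
            Thread.sleep(pauseMillis);
            waited += pauseMillis;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        int threads = numThreads(Collections.emptyMap());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        String result = runWithTimeout(pool, () -> "finished", 60, TimeUnit.MINUTES, 2);
        System.out.println(threads + " thread(s), task result: " + result);
        pool.shutdown();
    }
}
```

In this sketch the timeout is the upper bound on how long a thread waits for another worker, and the pause length only controls how often the lock is rechecked. This fits the recommendation that the timeout exceed the longest expected feature selection or classification task.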