RapidMiner Data Types

The document describes the different data types that RapidMiner assigns to attributes, including numeric, nominal, date_time, integer, real, and more. It also lists the common port abbreviations used in RapidMiner operators and provides a brief description of each port.

Uploaded by

Satyajeet Gaur

RapidMiner data types

The following terms describe the data types RapidMiner assigns to attributes. Defining a data
type specifies the kind of values allowed for an attribute. RapidMiner supports the natural
division of numbers, texts, and dates. Numeric is the label for numbers, nominal for texts or
strings, and date_time for dates.

attribute

Parent of all possible types ("any type").

binominal

Exactly two values (for example true/false or yes/no).

date

Date without time (for example 23.12.2014).

date_time

Both date and time (for example 23.12.2014 17:59).

file_path

Nominal data type (rarely used) that allows for more granular distinction. Can be used to
mark a column as "only containing file paths."

integer

A whole number (for example, 23, -5, or 11,024,768).

nominal

All kinds of text values; includes polynominal and binominal.

numeric

All kinds of number values; includes date, time, integer, and real numbers.

polynominal

Many different string values (for example red, green, blue, yellow).

real

A fractional number (for example 11.23 or -0.0001).


text

Nominal data type that allows for more granular distinction (to differentiate from
polynominal).

time

Time without date (for example 17:59).
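As a rough illustration of how the definitions above partition values, the following Python sketch guesses a type name for a column of plain Python values. It is not RapidMiner code: the function name guess_value_type and the mapping from Python types to RapidMiner type names are assumptions made for this example only.

```python
from datetime import date, datetime, time


def guess_value_type(values):
    """Guess which of the type names listed above fits a column of Python values.

    Hypothetical helper for illustration; RapidMiner's own type detection may differ.
    """
    if not values:
        return "attribute"  # nothing to inspect, fall back to "any type"
    # bool is checked first because Python treats bool as a subclass of int
    if all(isinstance(v, bool) for v in values):
        return "binominal"
    # datetime is checked before date because datetime is a subclass of date
    if all(isinstance(v, datetime) for v in values):
        return "date_time"
    if all(isinstance(v, date) for v in values):
        return "date"
    if all(isinstance(v, time) for v in values):
        return "time"
    if any(isinstance(v, bool) for v in values):
        return "attribute"  # booleans mixed with other kinds of values
    if all(isinstance(v, int) for v in values):
        return "integer"
    if all(isinstance(v, (int, float)) for v in values):
        return "real"
    if all(isinstance(v, str) for v in values):
        # exactly two distinct strings -> binominal, otherwise polynominal
        return "binominal" if len(set(values)) == 2 else "polynominal"
    return "attribute"  # mixed kinds of values


print(guess_value_type(["yes", "no", "yes"]))
print(guess_value_type([23, -5, 11024768]))
print(guess_value_type([datetime(2014, 12, 23, 17, 59)]))
```

The sketch returns the most specific name it can justify and falls back to attribute, the parent of all types, when the values are mixed.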

Operator port information


Ports

The points through which data moves, each represented by a labeled semicircle icon on the sides of
operators and of the Design view. See the list of port abbreviations below.

To see your filtered example set, connect the Output (out) port of the Retrieve operator to the
ExampleSet (exa) port of Filter Examples. Then, connect the ExampleSet (exa) port on
Filter Examples to the Results (res) port at the right of the Process view and click Run.
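The wiring described above can be pictured as a short chain of port-to-port connections. The Python sketch below only models that chain as data; it does not call RapidMiner. The tuple layout, the function name trace_path, and the name "Process" for the owner of the res port are assumptions made for this example.

```python
# Each connection: (source operator, source port, target operator, target port).
# "Process" is a hypothetical name for the Process view that owns the res port.
CONNECTIONS = [
    ("Retrieve", "out", "Filter Examples", "exa"),
    ("Filter Examples", "exa", "Process", "res"),
]


def trace_path(connections, start):
    """Follow connections from one operator to the next, stopping on a dead end or a loop."""
    next_hop = {src: dst for src, _src_port, dst, _dst_port in connections}
    path = [start]
    while path[-1] in next_hop and next_hop[path[-1]] not in path:
        path.append(next_hop[path[-1]])
    return path


print(" -> ".join(trace_path(CONNECTIONS, "Retrieve")))
```

If the second connection is missing, the path ends at Filter Examples, which corresponds to a process whose filtered example set never reaches the Results port.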

The following table lists each port abbreviation and provides a brief description.

Port abbreviation  Meaning                Description
ano                Anova                  ANOVA matrix for ANOVA significance test
ann                Annotation             Annotations extracted from the input object
arc                Archive                Archive file generated during execution of the operator
ass                Association            Association rules that have been discovered in a frequent item set
att                Attribute              Attribute weights (in and out)
ave                Average                Performance measures; estimate of performance using the model built on the complete delivered data set
clu                Cluster model          Cluster model created when clustering an example set
clu                Clustered set          Example set given to the clustering operator; may contain an attribute with a cluster role (describes the cluster of each example)
col                Collection             Collection of objects
con                Condition              Any object can be supplied; the condition specified in parameters is tested on this object
cov                Covariance             Covariance matrix
dic                Dictionary             Example set used for replacing 'from' values with 'to' values in a given example set
dis                Distance measure       SimilarityMeasure object
doc                Document               Document or document set
err                Error                  Standard error output
est                Estimated performance  Performance vector of the SVM model, which gives an estimation of the statistical performance of this model
exa                Example set            Example set
fil                File                   File object
fla                Flat                   Flat collection or flat clustering model
for                Formula                Formula result
fre                Frequent               Frequent item or item sets for association rule learning
gro                Grouped                Grouped models, attributes, items
hie                Hierarchical           Hierarchical clustering model
inp                Input                  Input source, can take various objects
ite                Item sets              Frequent item sets (groups of items that often appear together in the data)
joi                Join                   Join of the left and right example sets
lab                Labeled data           Model that was given in input is applied on the example set and the updated example set is delivered from this port
lef                Left                   Left input port expecting an example set, which is used as the left example set for a join
lif                Lift chart             Lift Pareto chart for the given model and example set
mat                Matrix                 Correlation matrix of all attributes of the input example set
mer                Merged                 Merged example set
mod                Model                  Default model from this output port
obj                Object                 IO object
ori                Original               Input example set is passed without changing to this port
out                Output                 Output port
par                Parameter set          Set of parameters that can be applied on an operator
pat                Patterns               GSP algorithm is applied on the given example set; resultant sequential patterns set is delivered through this port
per                Performance            Performance Vector for selected attributes
pre                Preprocessing          Preprocessing model with information regarding the operator's parameters in the current process
ran                Random forest          Model of a random forest
ref                Reference              Provided reference data or reference set
req                Request set            Provided example set
res                Result set             Distance or similarity between examples of the request set and reference set
rig                Right                  Right input port expecting an example set, which is used as the right example set for a join
roc                ROC curve              Calculated ROC curves for included models
rul                Rules                  Association rules that have been discovered in a frequent item set
sec                Second                 Input takes an example set derived from the output of the Generate ID operator in an attached example process
seg                Segment                Segment of an image
sel                Selected               Object specified by the index parameter is returned through this port
ses                Session                Session example set
sig                Significance           Significance test results of the performance vector comparison are delivered through this port
sim                Similarity             Calculated similarity between each example of the given example set and every other example of the same set
sin                Single                 Single object of the given collection, which is processed in the inner part of the operator
sta                Stacking               Stacking examples or model
sto                Stored                 Through this port, the input object is passed without changing to the output
sub                Subtrahend             Expects an example set; example set must have ID attribute
sup                Superset               Superset of input example sets
thr                Through                Objects are passed through without changing
thr                Threshold              Threshold output of the Select Recall operator
tra                Training               Training data to train a model (example set)
uni                Union                  Union of the input example sets
unl                Unlabeled              Examples that are not labeled and therefore not used when training a model
unm                Unmatched              Examples that did not match a specified pattern in the original example set
unr                Unrelated              Examples that were unrelated to a specified pattern in the original example set
vis                Visualization          Self-organizing map (SOM) visualization
wei                Weights                Attribute weights
wor                Word                   Expects or outputs a word list
xsl                XSLT                   Extensible Stylesheet Language Transformations (XSLT) document
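Two abbreviations in the table, clu and thr, each carry two meanings, so a lookup keyed on the abbreviation alone can return more than one answer. The Python sketch below shows one way to hold a few rows of the table so that this ambiguity stays visible. The names PORT_MEANINGS and describe_port are hypothetical, and only a handful of rows are copied in.

```python
# A small excerpt of the table above: abbreviation -> list of possible meanings.
PORT_MEANINGS = {
    "exa": ["Example set"],
    "mod": ["Model"],
    "out": ["Output"],
    "res": ["Result set"],
    "clu": ["Cluster model", "Clustered set"],  # listed twice in the table
    "thr": ["Through", "Threshold"],            # listed twice in the table
}


def describe_port(abbreviation):
    """Return a one-line description for a port abbreviation from the excerpt."""
    key = abbreviation.strip().lower()
    meanings = PORT_MEANINGS.get(key)
    if meanings is None:
        return f"{key}: not in this excerpt of the table"
    return f"{key}: " + " or ".join(meanings)


print(describe_port("exa"))
print(describe_port("clu"))
```

When an abbreviation is ambiguous, the operator the port belongs to decides which meaning applies.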
