Categorization




This chapter describes the Categorization plug-in and walks through a sample use case with step-by-step instructions.

 

Contents

1. Introduction to the plug-in

 1.1 Function

 1.2 Features

 1.3 Position in the Overall Software Package

         1.3.1 Parent Module

         1.3.2 Links to other Plug-ins

2. Interface

 2.1 Layout

 2.2 Menu

 2.3 Options

3. Working with the plug-in

 3.1 How to Add Data

         3.1.1 Data Types

         3.1.2 Data Source

 3.2 Plug-in Process

 3.3 How to Evaluate Results

4. Sample Use Case

 

 

1. Introduction to the plug-in

1.1 Function

The Categorization plug-in enables you to divide data into quantile-based categories or fixed-length categories.

 

1.2 Features

You can use the Categorization plug-in to divide a self-compiled data pool of numerical values into categories. The following categorization types are available:

Intervals count

Intervals length

Quantile

You can define settings such as the number of categories or the length of the intervals. Where the interval length is calculated, you can also specify a minimum and a maximum value.

Once the calculation is complete, the calculated intervals are listed and visualized in a chart.
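
The two interval-based types can be pictured with a short Python sketch. This is not SimAssist code: the function names, the optional minimum/maximum parameters and the sample values are assumptions made for illustration, and the plug-in's actual rounding and boundary rules may differ.

```python
import math

def _bounds(values, minimum=None, maximum=None):
    # Fall back to the data range when no explicit limit is given.
    lo = min(values) if minimum is None else minimum
    hi = max(values) if maximum is None else maximum
    return lo, hi

def interval_count_edges(values, count, minimum=None, maximum=None):
    """Intervals count: split the range into `count` equal-width intervals."""
    lo, hi = _bounds(values, minimum, maximum)
    width = (hi - lo) / count
    return [lo + i * width for i in range(count + 1)]

def interval_length_edges(values, length, minimum=None, maximum=None):
    """Intervals length: step through the range in intervals of fixed `length`."""
    lo, hi = _bounds(values, minimum, maximum)
    steps = max(1, math.ceil((hi - lo) / length))
    return [lo + i * length for i in range(steps + 1)]

# Made-up sample values (hypothetical passenger counts).
passengers = [12, 48, 75, 90, 110, 135, 160, 200]
count_edges = interval_count_edges(passengers, 4)
length_edges = interval_length_edges(passengers, 50, minimum=0)
print(count_edges)
print(length_edges)
```

With these sample values, the first call yields four intervals of width 47 between 12 and 200, and the second yields four intervals of length 50 starting at the chosen minimum of 0.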

 

1.3 Position in the Overall Software Package

1.3.1 Parent Module

The Categorization plug-in is part of the 2analyze module, which also contains the Confidence Interval, Distribution Fitting, ABC Analysis and Correlation Matrix plug-ins.

Categorization is available when you license the 2analyze module for SimAssist.

 

1.3.2 Links to other Plug-ins

The Categorization plug-in is linked to a total of three other plug-ins.

The first is the Database Definition plug-in, which acts as the data source and provides the data pool for the categorization.

The SQL Statements plug-in lets you narrow down the data pool with SQL queries; the Variables plug-in can be used for the same purpose.
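
As a rough illustration of how an SQL query can shape the data pool, the following Python sketch uses an in-memory SQLite database. The table name flights, the column FLG_Carrier and all rows are hypothetical; only the column name FLG_Passengers is taken from the sample use case in this chapter. It does not show how the SQL Statements plug-in itself is operated.

```python
import sqlite3

# Hypothetical table and rows, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (FLG_Passengers INTEGER, FLG_Carrier TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [(120, "A"), (None, "B"), (85, "A"), (210, "C")],
)

# Keep only numeric, non-empty values of one carrier as the data pool.
rows = conn.execute(
    "SELECT FLG_Passengers FROM flights "
    "WHERE FLG_Passengers IS NOT NULL AND FLG_Carrier = 'A' "
    "ORDER BY FLG_Passengers"
).fetchall()
data_pool = [row[0] for row in rows]
print(data_pool)
conn.close()
```

The query returns a single numeric column, which matches the kind of data the categorization expects.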

 

2. Interface

2.1 Layout

Figure 1 - Layout of the Categorization plug-in

 

The Categorization plug-in is divided into three main areas. First, at the top of the plug-in window, is the menu. The available buttons are grouped in the categories Data Source, Mode, View, Chart, Chart Type, Axes and Pivot.

The plug-in menu, which is integrated into the main menu, is displayed when a new instance of the plug-in is added to the project content area.

The Data Area is filled with data from the connected data source. Below it, the Pivot Fields are displayed.

For further information about the Pivot Fields, see the chapter Pivot Fields.

The buttons in the categories Data Source, View, Chart, Chart Type, Axes and Pivot are always available. For more information about these buttons, see the chapter Pivot Chart.

The largest area of the plug-in window is taken up by the interval/chart area. After the calculation, the categorization intervals and the associated chart are displayed here.

 

2.2 Menu

The Categorization plug-in menu is embedded into the frame application menu and varies depending on the categorization type selected (intervals count, intervals length or quantile).

 

Quantile Count / Interval Count

Figure 2 - Quantile Count / Interval Count

 

For the categorization types Quantile Count and Interval Count, you can specify the number of categories into which you want to divide the data pool values (see Figure 2).

In the Count field, you can either enter a numerical value manually or use the arrow icons to configure the desired number of categories.
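
For the quantile type, the Count value can be read as the number of categories that should each hold roughly the same number of values. The following Python sketch illustrates this under assumptions: the function name and the sample values are made up, and the "inclusive" interpolation method is one of several possible quantile definitions, not necessarily the one the plug-in uses.

```python
import statistics

def quantile_edges(values, count):
    """Return count + 1 boundaries so that each category holds a similar share of values."""
    # statistics.quantiles returns the count - 1 inner cut points.
    cuts = statistics.quantiles(values, n=count, method="inclusive")
    return [min(values), *cuts, max(values)]

# Made-up sample values.
sample = [10, 20, 30, 40, 50, 60, 70, 80, 90]
edges = quantile_edges(sample, 4)
print(edges)
```

Unlike equal-width intervals, these boundaries follow the distribution of the data, so skewed data produces categories of different lengths.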

 

Intervals Length

Figure 3 - Interval Length

 

For the categorization type intervals length, you can specify the length of the intervals.

Here too, you can either enter a numerical value manually in the relevant field or use the arrow icons to configure the desired length (see Figure 3).

 

 

2.3 Options

Under Options in the SimAssist main menu, you can configure plug-in-specific settings (see chapter Options). The following options are available for the Categorization plug-in:

Option: Description

Diagram

 Show end value: Specifies the default for whether the end value of an interval is shown. Applied when a new plug-in instance is created.
 X-axis zooming: Specifies the default for whether zooming is allowed for the diagram's panes along their X-axes. Applied when a new plug-in instance is created.
 Y-axis zooming: Specifies the default for whether zooming is allowed for the diagram's panes along their Y-axes. Applied when a new plug-in instance is created.

Grouping

 Complement intervals: Specifies whether missing intervals are complemented.
 Interval type: Determines how numeric values or dates are assigned to a range.
 Maximum interval count: Defines the maximum number of intervals that the grouping can create.
 Substring mode: Sets the direction of the substring operation when grouping alphanumeric values.

PivotChart

 Orientation: Specifies the default orientation for new diagrams. Applied when a new plug-in instance is created.

Templates

 Default template: This template is applied once when a new instance is created.

 

 

3. Working with the plug-in

3.1 How to Add Data

3.1.1 Data Types

Because the Categorization plug-in performs calculations to group data into categories, only numerical values can be added in the Data Source area.

 

3.1.2 Data Source

To add data to the Categorization plug-in, drag the desired data representation, such as a table column, from the data area in the project window to the Row Area of the Pivot Fields.

If a new column is added to the Row Area, the previous one is replaced automatically.

 

3.2 Plug-in Process

1. First, add data to the plug-in: drag the desired data representation into the plug-in window.

2. Then choose the desired categorization type: intervals count, intervals length or quantile.

3. Define the count (or, for Interval Length, the length) to fit the calculation to your requirements.

4. The calculation starts automatically as soon as a value or column is added to the plug-in window or the Row Area.

5. If you make changes, update the calculation with the Apply button (see figures 2 and 3).

 

3.3 How to Evaluate Results

Once the categorization has been calculated, the intervals and associated chart are displayed in the plug-in window area. You can use the tab bar at the top of the display to switch between the two displays (see figures 4 and 5).

The created diagrams can be configured individually with the help of the Chart-Designer (see figure 1). Detailed information about the Chart-Designer can be found in the chapter Pivot Chart.

Figure 4 - Intervals

Figure 5 - Chart

 

The number of values used to calculate the categorization is shown at the bottom of the intervals display.
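
How values end up in intervals, and how the total number of values relates to the per-interval counts, can be sketched in Python as follows. The boundary rule used here (lower bound included, upper bound excluded, except in the last interval) is an assumption; in the plug-in this behavior is governed by the Interval type option. The function name and sample values are hypothetical.

```python
from bisect import bisect_right

def count_per_interval(values, edges):
    """Count values per interval [edges[i], edges[i+1]); the last interval also includes its end value."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        if value < edges[0] or value > edges[-1]:
            continue  # outside the configured range, so not counted
        index = min(bisect_right(edges, value) - 1, len(counts) - 1)
        counts[index] += 1
    return counts

# Made-up sample values and interval boundaries.
values = [12, 48, 75, 90, 110, 135, 160, 200]
edges = [0, 50, 100, 150, 200]
counts = count_per_interval(values, edges)
total = sum(counts)
print(counts, total)
```

In this sketch, the sum of the per-interval counts equals the number of values that fell inside the configured range.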

 

 

4. Sample Use Case

The following use case illustrates the functionality and use of the plug-in step by step.

 

Step 1

First, add a data source to the plug-in. To do so, drag a database entry into the plug-in window, or choose a single column.

If a database entry is selected, its columns appear in the content area of the pivot grid.

In this variant, the first column with a numeric value (in this case FLG_Passengers) is placed in the Row Area of the Pivot Area (see figure 6).

If only a single column is dragged into the plug-in window, the content area of the pivot table contains only that column, which must then be moved to the Row Area to start the evaluation.

Figure 6 - Adding Data to the plug-in

 

 

Step 2

After adding data to the plug-in, the categorization type can be set in the menu. You can choose between Quantile Count, Interval Count and Interval Length.

Click Apply to update the calculation and the diagrams (see figure 7).

Figure 7 - Plug-in menu

 

 

Step 3

To categorize another column of the data source, drag the desired column from the pivot table into the Row Area.

The calculation runs automatically with the current options (see figure 8).

Figure 8 - Adding new data to the plug-in

 

 


© SimPlan AG - Hanau District Court, Commercial Register (Part B) 6845 - info@simplan.de - www.simplan.de/en