Main manual

1 NAME
2 SYNOPSIS
3 DESCRIPTION

3.1 Positional arguments
3.2 Optional arguments
3.3 Errors

4 SUPPORTED FORMATS

4.1 Comma-separated values (.csv)
4.2 Attribute-Relation File Format (.arff)
4.3 C4.5 File Format (.data .names)
4.4 Burmeister (.cxt)
4.5 FIMI File Format (.dat)
4.6 DTL File Format (.dtl)

5 EXAMPLES

5.1 Sample data
5.2 Convert

5.2.1 Simple Conversion
5.2.2 Types specification
5.2.3 Classes specification
5.2.4 Types, classes specification
5.2.5 Object names and attribute names specification
5.2.6 Scale
5.2.7 Scale and object names specification
5.2.8 Scale and classes specification
5.2.9 Scale, object names and attribute names specification

5.3 Binary Unpacking
5.4 Filter
5.5 Info
5.6 Preview

6 INSTALLATION

6.1 Requirements
6.2 Download
6.3 Unpack
6.4 Finish

7 REPORTING BUGS
8 CONTRIBUTING
9 AUTHORS

9.1 Proofreaders

10 WEBSITE
11 LICENSE

NAME

Swift – Relational Data Converter

SYNOPSIS

swift-cli.py	[source]
	[-h] [-ss source_separator] [-ta target_attributes] [-i]
	[-mv missing_value] [-snh] [-tnh] [-t [target]]
	[-ts target_separator] [-to target_objects] [-n name]
	[-cls classes] [-sf {csv,arff,dat,data,cxt,dtl}]
	[-tf {csv,arff,dat,data,cxt,dtl}] [-c [rows_count]] [-p [rows_count]]
	[-sl skipped_lines] [-se] [-scs source_cls_separator]
	[-tcs target_cls_separator]
swift.py

DESCRIPTION

Swift – Relational Data Converter is a program for converting data files between six different formats. All accepted formats are text formats and each can be converted to any other (or to itself), which gives 36 possible conversions. Swift focuses on tabular data, where rows represent objects (instances) and columns represent their attributes (properties). Besides data conversion, the program also supports:

Filtering - changing the column order, changing how many times a column occurs, and skipping columns and rows.
Scaling - necessary for converting multivalent formats (such as CSV, ARFF, DATA) to bivalent formats (such as CXT, DAT, DTL), but usable in any conversion.
Analyzing - produces statistics of the data.
Preview - prints selected parts of the data.

All operations work with data of any size; the only limitation is hard drive space (not RAM).

Swift provides a command-line interface (swift-cli.py) and a graphical user interface (swift.py), which is described in a separate GUI Manual. For a quick start, go to the EXAMPLES section, which contains examples of common use cases.

Positional arguments

source: The name of the file that the program reads and processes, in one of the supported formats. The source name must end with a valid format extension, or the optional --source_format argument must be used. If the source is omitted or equals "-", the program reads the input from the stdin.

Optional arguments

-h, --help

Print the help message to the stdout and exit.

-t target, --target target

The name of the file that the program writes to, in one of the supported formats. The --target name must end with a valid format extension, or the optional --target_format argument must be used. If the --target is omitted or equals "-", the program writes the output to the stdout.

-ta target_attributes, --target_attributes target_attributes

The list of formulas separated by ";",

formulas ::= formula | (formula ";" formulas)

where a formula defines one or more attributes in the target file. The formula is of the form:

formula ::= (new-names "=")? old-names ((":" type ("[" scale "]")?) | "[]")?

The first part of the formula:

(new-names "=")? old-names

where new-names and old-names are the lists of names separated by ",":

names ::= name | (name "," names)

where name is a word or an interval:

name ::= \w+ | ((\d+)? "-" (\d+)?) | "*"

has the following meaning: old-names refer to attributes in the source file by their names (if available) or indexes. New-names define the names of the attributes in the target file. If new-names are omitted, attributes in the target file have the same names as in the source file. New-names and old-names must have the same length, otherwise an error is produced.

The interval determines a range of indexes. If the lower bound is omitted, the range runs from zero to the upper bound. If the upper bound is omitted, the range runs from the lower bound to the maximum attribute index in the source file. If both bounds are omitted, the range runs from zero to the maximum attribute index in the source file (the alias "*" has the same meaning as "-").

Examples (eight attributes in the source file, indexes 0 to 7):

4-6 produces: 4, 5, 6

-5 produces: 0, 1, 2, 3, 4, 5

4- produces: 4, 5, 6, 7

- produces: 0, 1, 2, 3, 4, 5, 6, 7

* produces: 0, 1, 2, 3, 4, 5, 6, 7
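The interval rules can be sketched in a few lines of Python. This is only an illustration of the rules described above, not Swift's own parser: the function name expand_name is hypothetical, and the maximum attribute index (7, as in the outputs listed above) is passed in explicitly.

```python
import re

def expand_name(name, max_index):
    """Expand one `name` token into a list of indexes, or return the key as is.

    `max_index` is the highest attribute index in the source (assumed known).
    """
    if name == "*":
        name = "-"  # "*" is an alias for the fully open interval
    match = re.fullmatch(r"(\d+)?-(\d+)?", name)
    if match is None:
        return [name]  # a plain word: an attribute name or a single index
    lower = int(match.group(1)) if match.group(1) else 0
    upper = int(match.group(2)) if match.group(2) else max_index
    return list(range(lower, upper + 1))

print(expand_name("4-6", 7))  # [4, 5, 6]
print(expand_name("-5", 7))   # [0, 1, 2, 3, 4, 5]
print(expand_name("4-", 7))   # [4, 5, 6, 7]
```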

The second optional part of the formula

((":" type ("[" scale "]")?) | "[]")?

is composed of the attribute type,

type ::= "n" | "e" | "s" | ("d" ("/" date_format)?)
date_format ::= "F="? "'" .+ "'"

where characters are aliases of data types:

n = numeric - all real numbers
e = enumeration (nominal) - finite set of named values
s = string - word
d = date - the default date format is ISO 8601, which combines date and time: YYYY-MM-DDThh:mm:ss. Other date formats must be specified with Python format codes,

and of an optional definition of the scale. Binary unpacking ("[]") can be used instead of the type and the scale.

scale ::= num_scale | enum_scale | str_scale | date_scale | bin_vals

The scale is an expression that is evaluated against the value of the attribute. The result of the evaluation is True or False, represented as 1 and 0.

In the numeric scale,

num_scale ::= (var op num_val) | (num_val op var) | (num_val op var op num_val)

the numeric value is of the form:

num_val ::= int ("." int?)? ("e" int)?
int ::= ("+" | "-")? \d+

Examples of numeric value: 5, -10, +47, 2.46, -2.31, +98.31, -78.4e-48, 3e8

var ::= [a-z_]+

The variable (of the form above) stands for the attribute value being evaluated. The operators have the same meaning as in many programming languages such as C or Python.

op ::= "<" | ">" | "<=" | ">=" | "==" | "!="

Examples of numeric scales (variable is x): x!=10.3, -5<=x<=10, 50==x
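A minimal sketch of how such a numeric scale could be evaluated, assuming the grammar given above. The names eval_num_scale, OPS, NUM and TOKEN are hypothetical and this is not Swift's implementation (which may, for instance, rely on Pyparsing).

```python
import operator
import re

OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

# num_val, var and op from the grammar above, tried in this order.
NUM = r"[+-]?\d+(?:\.\d*)?(?:e[+-]?\d+)?"
TOKEN = re.compile(rf"\s*({NUM}|[a-z_]+|<=|>=|==|!=|<|>)")

def eval_num_scale(scale, value):
    """Return 1 if `value` satisfies the numeric scale, otherwise 0."""
    tokens = TOKEN.findall(scale)
    if "".join(tokens) != scale.replace(" ", "") or len(tokens) not in (3, 5):
        raise ValueError(f"invalid numeric scale: {scale!r}")
    # Operands sit at even positions, operators at odd positions.
    operands = [value if re.fullmatch(r"[a-z_]+", t) else float(t)
                for t in tokens[0::2]]
    operators = tokens[1::2]
    # A chained scale such as -5<=x<=10 holds only if every comparison holds.
    result = all(OPS[op](left, right)
                 for op, left, right in zip(operators, operands, operands[1:]))
    return int(result)

print(eval_num_scale("-5<=x<=10", 3))   # 1
print(eval_num_scale("x!=10.3", 10.3))  # 0
print(eval_num_scale("50==x", 50))      # 1
```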

The enumeration scale

enum_scale ::= "'" \w+ "'"

is one of the enumeration values, which must be quoted. If the value of the attribute is exactly the same as the scale, the result of the scaling is True, otherwise False. Embedded quotes must be escaped with a backslash or doubled. For example, to scale the value 'foo' (with quotes), the expression '\'foo\'' or '''foo''' must be used.

The string scale

str_scale ::= "'" .+ "'"

is a quoted Python regular expression. If the regular expression matches any substring of the scaled value, the result of the scaling is True, otherwise False. Embedded quotes must be escaped with a backslash or doubled (as in the enumeration scale described above).

Example of the string scale: "foo[-+*]*bar", which matches "hellfoobar", "foo--barxyz" ... but doesn't match "foo/bar" ...
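The "matches any substring" rule corresponds to re.search in Python (as opposed to re.match or re.fullmatch). A small sketch, with the hypothetical helper name eval_str_scale; the hyphen is written first inside the character class so that the re module reads it as a literal character rather than as a range.

```python
import re

def eval_str_scale(pattern, value):
    """Return 1 if the regular expression matches any substring of `value`."""
    return int(re.search(pattern, value) is not None)

pattern = r"foo[-+*]*bar"
for text in ("hellfoobar", "foo--barxyz", "foo/bar"):
    print(text, eval_str_scale(pattern, text))
# hellfoobar 1
# foo--barxyz 1
# foo/bar 0
```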

The date scale

date_scale ::= ((var op date_val) | (date_val op var) | (date_val op var op date_val))

is exactly the same expression as the numeric scale, but the values must be dates in a valid date format.
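As an illustration, a date scale of the form x>='...' could be evaluated by parsing both sides with the same Python format codes. The function name scale_date_at_least is hypothetical; the default format below spells out the ISO-8601 layout mentioned above, and a missing value yields 0, as described for --missing_value.

```python
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # YYYY-MM-DDThh:mm:ss

def scale_date_at_least(value, bound, date_format=ISO_FORMAT, missing_value=None):
    """Evaluate the scale x >= bound for a date attribute; returns 1 or 0."""
    if value == missing_value:
        return 0  # scaling a missing value always yields False
    parsed = datetime.strptime(value, date_format)
    return int(parsed >= datetime.strptime(bound, date_format))

print(scale_date_at_least("1991-06-13", "1991-01-01", "%Y-%m-%d"))  # 1
print(scale_date_at_least("1990-04-23", "1991-01-01", "%Y-%m-%d"))  # 0
print(scale_date_at_least("?", "1991-01-01", "%Y-%m-%d", "?"))      # 0
```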

Binary values

bin_vals ::= ("0="? "'" .* "'" ",")? "1="? "'" .+ "'"

allow new binary values to be defined for a bivalent attribute. The default values are 1 (True) and 0 (False). This is useful when converting from a multivalent format (which contains a bivalent attribute whose values differ from 0 and 1) to a bivalent format without needless scaling (because the values are already bivalent). Note that this is semantically the same as using an enumeration scale for the bivalent attribute, but binary values are preferred because they are more time-efficient.

The alternative to the scale is binary unpacking (total binarization) of an attribute, requested by writing "[]" just after the names. This creates a new attribute for every single value of the old attribute. Every new attribute is of the enumeration type with the scale: value. The order of the new attributes is the same as the order of values in the original attribute. If binary unpacking is used for several attributes, for example for two, then every new attribute derived from the first old attribute has a lower index than any new attribute derived from the second old attribute.

Example - binary unpacking of two attributes:

a x
b y
b z
a x
c y

Output:

1 0 0  1 0 0
0 1 0  0 1 0
0 1 0  0 0 1
1 0 0  1 0 0
0 0 1  0 1 0
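The example above can be reproduced with a short sketch. It assumes that the values of each attribute are ordered by their first appearance in the data, which is consistent with the example but is not confirmed for every source format; binary_unpack is a hypothetical helper name.

```python
def binary_unpack(rows):
    """Replace every column by one 0/1 column per distinct value.

    Values keep their order of first appearance within each column.
    """
    columns = list(zip(*rows))
    # dict.fromkeys keeps insertion order, giving each column's distinct values.
    values = [list(dict.fromkeys(column)) for column in columns]
    return [
        [int(cell == v)
         for cell, column_values in zip(row, values)
         for v in column_values]
        for row in rows
    ]

rows = [("a", "x"), ("b", "y"), ("b", "z"), ("a", "x"), ("c", "y")]
for unpacked in binary_unpack(rows):
    print(*unpacked)
# 1 0 0 1 0 0
# 0 1 0 0 1 0
# 0 1 0 0 0 1
# 1 0 0 1 0 0
# 0 0 1 0 1 0
```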

Complete grammar of the --target_attributes argument:

formulas ::= formula | (formula ";" formulas)
formula ::= (names "=")? names ((":" type ("[" scale "]")?) | ("[" bin_vals "]") |"[]")?
names ::= name | (name "," names)
type ::= "n" | "e" | "s" | ("d" ("/" date_format)?)
scale ::= num_scale | enum_scale | str_scale | date_scale
name ::= \w+ | ((\d+)? "-" (\d+)?) | "*"
date_format ::= "F="? "'" .+ "'"
num_scale ::= (var op num_val) | (num_val op var) |
              (num_val op var op num_val)
enum_scale ::= "'" \w+ "'"
str_scale ::= "'" .+ "'"
date_scale ::= ((var op date_val) | (date_val op var) |
                (date_val op var op date_val))
bin_vals ::= ("0="? "'" .* "'" ",")? "1="? "'" .+ "'"
var ::= [a-zA-Z_]+
op ::= "<" | ">" | "<=" | ">=" | "==" | "!="
date_val ::= "'" .+ "'"
num_val ::= int ("." int?)? ("e" int)?
int ::= ("+" | "-")? \d+

Notes:

If the --target_attributes argument is omitted, all attributes from the source are processed.

If the value of the --target_attributes argument begins with "-" (e.g. '-4,8,9'), it must be specified as: -ta='-4,8,9'.

-c [rows_count], --convert [rows_count]

Default action. Converts the source to the --target. The optional argument rows_count defines how many rows of data are processed; by default, all rows are processed.

-p [rows_count], --preview [rows_count]

Alternative action. Prints the desired number of rows from the source data to the stdout. By default 20 rows are printed; this can be changed with the optional rows_count argument.

-i, --info

Prints statistics of the source and of each processed attribute. The statistics have the form:

Relation name:
Objects count:
Attributes count:
====================

name:
index:
type: string/enumeration
values appearance:
    value: occurrences-count/total-count = %
	.
	.
	.

name:
index:
type: numeric/date
max: , min:
values appearance:
    value: occurrences-count/total-count = %
	.
	.
	.

.
.
.

The --info argument may be used as a single action (the statistics are printed to the stdout) or together with a conversion (if the --target is the stdout, the statistics are written to a new file named after the source, with the extension .info). Using --info together with a conversion saves time when both actions are required.

-sf {csv,arff,dat,data,cxt,dtl}, --source_format {csv,arff,dat,data,cxt,dtl}

Specifies the source format by its extension. If the source has no extension, --source_format is required. If the source has an extension and --source_format is also used, --source_format overrides the source extension.

-tf {csv,arff,dat,data,cxt,dtl}, --target_format {csv,arff,dat,data,cxt,dtl}

Specifies the --target format by its extension. If the --target has no extension, --target_format is required. If the --target has an extension and --target_format is also used, --target_format overrides the --target extension.

-cls classes, --classes classes

Selects attributes from the source to be used as classes in the --target. They can be specified using intervals of attribute indexes, names of attributes, or a combination of both. --classes is of the form:

classes ::= element | (element "," classes)
element ::= interval | key
interval ::= ((\d+)? "-" (\d+)?)
key ::= \w+

This argument is relevant only when the --target format is C4.5 or DTL.

-ss source_separator, --source_separator source_separator

Specifies the separator of attributes in the source (it affects only the current action). --source_separator must be specified when the separator used in the source differs from the default separator of the file format.

-ts target_separator, --target_separator target_separator

Specifies the separator of attributes in the --target (it affects only the current action). Attributes in the --target will be separated with this new value.

-scs source_classes_separator, --source_cls_separator source_classes_separator

Specifies the separator of attributes and classes in the source (it affects only the current action). --source_cls_separator must be specified when the classes/attributes separator used in the source differs from the default classes/attributes separator of the file format. This argument is relevant only when the source format is DTL.

-tcs target_classes_separator, --target_cls_separator target_classes_separator

Specifies the separator of attributes and classes in the --target (it affects only the current action). Attributes and classes in the --target will be separated with this new value. This argument is relevant only when the --target format is DTL.

-sl skip_lines, --skip_lines skip_lines

Intervals of the form

intervals ::= interval | (interval "," intervals)
interval ::= ((\d+)? "-" (\d+)?) | \d+

determine the indexes of the source lines that are skipped in every operation.
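A sketch of how such a list of intervals could be turned into a test on line indexes. The helper name parse_skip_lines is hypothetical, and whether Swift counts a header line as line 0 is not specified here, so the sketch works on plain indexes only. An open upper bound is kept as None so that the source size does not need to be known in advance.

```python
def parse_skip_lines(spec):
    """Turn e.g. "0,3-5,8-" into a predicate over line indexes."""
    bounds = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            bounds.append((int(low) if low else 0, int(high) if high else None))
        else:
            bounds.append((int(part), int(part)))  # a single index

    def is_skipped(index):
        return any(low <= index and (high is None or index <= high)
                   for low, high in bounds)

    return is_skipped

is_skipped = parse_skip_lines("0,3-5,8-")
print([i for i in range(12) if is_skipped(i)])  # [0, 3, 4, 5, 8, 9, 10, 11]
```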

-se, --skip_errors

Errors produced by invalid lines in the source are skipped. The program continues and the skipped errors are printed to the stderr.

-n name, --name name

Specifies a new name of relation (data).

-mv missing_value, --missing_value missing_value

Specifies the value, which will be interpreted as an undefined value (None/NULL). The result of scaling the --missing_value is always False (0).

-o objects, --objects objects

The list of object names separated by comma:

objects ::= name | (name "," objects)

This argument is relevant only when the --target format is Burmeister. If --objects is omitted, indexes of objects are used.

-snh source_no_header, --source_no_header source_no_header

The first row in the CSV source is interpreted as an object (not header).

-tnh target_no_header, --target_no_header target_no_header

The CSV --target will have an object on the first line (not header).

Errors

The errors below are produced by the program. Each error specification is of the form:

error_code: error_name
	error_description

When an error is raised, the program ends and returns the error code.

1: Swift Unknown Error: If you get this error, please report a bug with an error message printed below. Thank you.
2: Argument Error: Some of required arguments are missing or aren't specified correctly.
3: ARFF Header Error: A syntax error in the header of the ARFF source file.
4: DATA Header Error: A syntax error in the header of the DATA source file.
5: CSV Header Error: A syntax error in the header of the CSV source file.
6: DAT Header Error: A syntax error in the header of the DAT source file.
7: CXT Header Error: A syntax error in the header of the CXT source file.
8: ARFF Line Error: A syntax error in a line of the ARFF source file.
9: DATA Line Error: A syntax error in a line of the DATA source file.
10: CSV Line Error: A syntax error in a line of the CSV source file.
11: DAT Line Error: A syntax error in a line of the DAT source file.
12: CXT Line Error: A syntax error in a line of the CXT source file.
13: DTL Line Error: A syntax error in a line of the DTL source file.
14: Formula Error: A syntax error in a formula of the --target_attributes argument.
15: Formula Names Error: The count of new names and the count of old names aren't equal.
16: Sequence Error: A syntax error in a sequence (interval).
17: DATE Value Error: An invalid value of a date attribute.
18: NUMERIC Value Error: An invalid value of a numeric attribute.
19: STRING Value Error: An invalid value of a string attribute.
20: NOMINAL Value Error: An invalid value of a nominal (enumeration) attribute.
21: DATE Error: Invalid syntax of a date.
22: Formula Date Value/Format Error: The value and the format of the date don't match.
23: Formula Regular Expression Error: An invalid regular expression in the formula.
24: Formula Attribute Key Error: The key of the attribute doesn't exist.
25: Keyboard Interrupt Error
26: Bivalent Error: An invalid bivalent value in the data.
27: Broken Pipe Error
28: Names File Error: Files: .names and .data (C4.5 format) aren't in the same directory.
29: DTL Header Error: A syntax error in the header of the DTL source file.
30: Not Enough Lines Error: The source file doesn't have enough lines. Some required part of the file is missing or the file is empty.
31: Class Key Error: The key of the class doesn't exist.

SUPPORTED FORMATS

Swift supports the following six formats: CSV, ARFF, DATA, CXT, DAT and DTL.

Comma-separated values (.csv)

The description of the format with examples can be found here. The default file format (class/attributes) separator is a comma.

Notes: An attribute separator inside a value must always be escaped with a backslash.

Attribute-Relation File Format (.arff)

The description of the format with examples can be found here. The default file format (class/attributes) separator is a comma.

Notes: The optional date format for a date attribute in the header must be specified with Python format codes (not in the Java SimpleDateFormat syntax described in the official documentation).

Example:

@relation birthdays
@attribute birthday date %Y-%m-%d
@data
1990-04-23
1993-12-03
1989-03-31

When converting an ARFF file with relational attributes to another format, the relational attributes are linearized (using dot notation, see the example).

Example:

@attribute humidity relational
    @attribute absolute relational
        @attribute day numeric
        @attribute night numeric
    @end absolute
@end humidity

The relational attribute humidity (above) will be converted to the following attributes (e.g. for a CSV target):

humidity.absolute.day, humidity.absolute.night
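The linearization amounts to a depth-first walk that joins the names on the path with dots. A sketch under the assumption that the header has already been parsed into nested (name, type-or-children) pairs; this data structure and the function name linearize are illustrative only.

```python
def linearize(attributes, prefix=""):
    """Flatten nested (name, type_or_children) pairs into dotted names."""
    names = []
    for name, spec in attributes:
        full_name = f"{prefix}{name}"
        if isinstance(spec, list):  # a relational attribute with children
            names.extend(linearize(spec, prefix=full_name + "."))
        else:
            names.append(full_name)
    return names

humidity = [
    ("humidity", [
        ("absolute", [("day", "numeric"), ("night", "numeric")]),
    ]),
]
print(linearize(humidity))
# ['humidity.absolute.day', 'humidity.absolute.night']
```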

C4.5 File Format (.data .names)

The description of the format with examples can be found here. The default file format (class/attributes) separator is a comma.

Notes: In any conversion, a class is handled exactly like an attribute (it can be scaled, totally binarized, ...). The key used in --target_attributes is the class index or the name "class".

Burmeister (.cxt)

The description of the format with examples can be found here.

FIMI File Format (.dat)

The description of the format with examples can be found here. The default file format (class/attributes) separator is a white space.

Notes: Blank lines in a FIMI source are interpreted as objects with all attributes of the value 0 (False). Conversely, objects with all attributes of the value 0 (False) are written to a FIMI target as blank lines. This differs from the official format documentation, which ignores blank lines and objects with all attributes of the value 0.
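The blank-line rule can be illustrated with a short round trip between FIMI lines and 0/1 rows. The helper names are hypothetical, and the number of attributes is passed in explicitly because it cannot be read off a single line.

```python
def fimi_to_rows(lines, attribute_count):
    """Read FIMI lines (indexes of attributes with value 1) as 0/1 rows."""
    rows = []
    for line in lines:
        present = {int(token) for token in line.split()}
        # A blank line gives an empty set, i.e. an all-zero object.
        rows.append([int(i in present) for i in range(attribute_count)])
    return rows

def rows_to_fimi(rows):
    """Write 0/1 rows as FIMI lines; an all-zero row becomes a blank line."""
    return [" ".join(str(i) for i, bit in enumerate(row) if bit) for row in rows]

rows = fimi_to_rows(["0 2", "", "1"], attribute_count=3)
print(rows)                # [[1, 0, 1], [0, 0, 0], [0, 1, 0]]
print(rows_to_fimi(rows))  # ['0 2', '', '1']
```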

DTL File Format (.dtl)

This format is very similar to FIMI, but it supports specification of classes for each object. The file consists of rows, which are of the form:

attributes "|" classes

where the attributes part is exactly the same as in the FIMI format (values are indexes of attributes with the value 1) and "|" is the default class/attributes separator. The classes part consists of class values, separated with the same separator as the attributes.

Example (class1={a,b}, class2={aa,bb}):

0 1 2 3 4|a bb
1 2 3 4|a aa
2 3 4|b bb
3 4|a bb
4|b bb

In any conversion, classes are handled exactly like attributes (they can be scaled, totally binarized, ...). The key used in --target_attributes is the class index or the name: "class1", "class2", ...
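A sketch of reading one DTL row into a flat record of 0/1 attribute values followed by the class values. The function name parse_dtl_line is hypothetical, and the attribute count (five in the example above) is assumed to be known.

```python
def parse_dtl_line(line, attribute_count, cls_separator="|"):
    """Split a DTL row into 0/1 attribute values followed by class values."""
    attributes, _, classes = line.partition(cls_separator)
    present = {int(token) for token in attributes.split()}
    bits = [int(i in present) for i in range(attribute_count)]
    return bits + classes.split()

print(parse_dtl_line("1 2 3 4|a aa", 5))  # [0, 1, 1, 1, 1, 'a', 'aa']
print(parse_dtl_line("4|b bb", 5))        # [0, 0, 0, 0, 1, 'b', 'bb']
```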

EXAMPLES

All examples in this section use the following sample data.

Sample data

CSV - example.csv

name,   birth_date, credits, study, sex
George, 1991-06-13, 54,      true,  man
Monica, 1990-04-23, 98,      false, woman
Mia,    ?,          87,      true,  woman
John,   1989-11-11, 91,      true,  man

DTL - example.dtl

0 1 2 3 4|a bb
1 2 3 4|a aa
2 3 4|b bb
3 4|a bb
4|b bb

DAT - example.dat

DATA

example.names

foo, bar.
age: continuous.
job: teacher, pilot, doctor.
work: discrete 2.
sport: ignore.

example.data

44, doctor,  1, foo
30, teacher, 0, bar
35, ?,       1, foo
31, pilot,   0, foo

Convert

This section is divided into nine subsections according to the arguments required by the particular conversion. Each subsection contains the list of required arguments, one illustrative example, and a list of further conversions that can be used similarly.

For quick navigation, you can use the following table of all possible conversions.

CSV to CSV, CSV to ARFF, CSV to DATA, CSV to CXT, CSV to DAT, CSV to DTL
ARFF to ARFF, ARFF to CSV, ARFF to DATA, ARFF to CXT, ARFF to DAT, ARFF to DTL
DATA to DATA, DATA to ARFF, DATA to CSV, DATA to CXT, DATA to DAT, DATA to DTL
CXT to CXT, CXT to DATA, CXT to ARFF, CXT to CSV, CXT to DAT, CXT to DTL
DAT to DAT, DAT to CXT, DAT to DATA, DAT to ARFF, DAT to CSV, DAT to DTL
DTL to DTL, DTL to DAT, DTL to CXT, DTL to DATA, DTL to ARFF, DTL to CSV

Simple Conversion

Required arguments:

source
--target / --target_format

DTL (example.dtl) to CSV

swift-cli.py example.dtl -t result.csv

result.csv


0,1,2,3,4,class1,class2
1,1,1,1,1,a,bb
0,1,1,1,1,a,aa
0,0,1,1,1,b,bb
0,0,0,1,1,a,bb
0,0,0,0,1,b,bb

Further possible conversions:

CSV to CSV
ARFF to ARFF
ARFF to CSV
DATA to CSV
DATA to ARFF
DAT to DAT
DAT to CSV
DAT to ARFF
CXT to CXT
CXT to CSV
CXT to DAT
CXT to ARFF
DTL to ARFF

Types specification

Required arguments:

source
--target / --target_format
--target_attributes (with types specification)

CSV (example.csv) to ARFF

swift-cli.py example.csv -t result.arff -mv '?' -ta "name:s; 1:d/'%Y-%m-%d'; credits:n; work,gender=3,4:e" -n people

result.arff

@relation people

@attribute name string
@attribute birth_date date %Y-%m-%d
@attribute credits numeric
@attribute work { true,false }
@attribute gender { man,woman }

@data
George,1991-06-13,54,true,man
Monica,1990-04-23,98,false,woman
Mia,?,87,true,woman
John,1989-11-11,91,true,man

Classes specification

Required arguments:

source
--target / --target_format
--classes

DATA (example.data) to DATA

swift-cli.py example.data -t result.data -cls class

result.names

foo,bar.
age: continuous.
job: teacher,pilot,doctor.
work: 1,0.
class_prev: foo,bar.

result.data

44,doctor,1,foo,foo
30,teacher,0,bar,bar
35,?,1,foo,foo
31,pilot,0,foo,foo

Further possible conversions:

ARFF to DATA
DATA to DATA
DAT to DTL
DAT to DATA
CXT to DATA
CXT to DTL
DTL to DATA

Types, classes specification

Required arguments:

source
--target / --target_format
--target_attributes (with types specification)
--classes

CSV (example.csv) to DATA

swift-cli.py example.csv -t result.data -mv '?' -ta "name:s; 1:d/'%Y-%m-%d'; credits:n; 3,4:e" -cls sex

result.names

man,woman.
name: discrete n.
birth_date: discrete n.
credits: continuous.
study: true,false.
sex: man,woman.

result.data

George,1991-06-13,54,true,man,man
Monica,1990-04-23,98,false,woman,woman
Mia,?,87,true,woman,woman
John,1989-11-11,91,true,man,man

Object names and attribute names specification

Required arguments:

source
--target / --target_format
--objects
--target_attributes (with new names; if omitted, indexes of attributes are used instead)

DAT (example.dat) to CXT

swift-cli.py example.dat -t result.cxt -o foo,bar,foobar,barfoo -ta 'a=0;b=1;c=2;d=3'

result.cxt

B

4
4
foo
bar
foobar
barfoo
a
b
c
d
X...
.X..
..X.
...X
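The structure of the result can be sketched as follows. The sketch simply mirrors the line order of the result.cxt listing above (the "B" marker, an empty line, the two counts, object names, attribute names, then one row of "X" and "." per object) and is not a full Burmeister writer; render_cxt is a hypothetical name.

```python
def render_cxt(objects, attributes, rows):
    """Render 0/1 rows in the layout of the result.cxt listing above."""
    lines = ["B", "", str(len(objects)), str(len(attributes))]
    lines += list(objects) + list(attributes)
    lines += ["".join("X" if bit else "." for bit in row) for row in rows]
    return "\n".join(lines)

identity = [[int(i == j) for j in range(4)] for i in range(4)]
print(render_cxt(["foo", "bar", "foobar", "barfoo"], ["a", "b", "c", "d"], identity))
```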

Scale

Required arguments:

source
--target / --target_format
--target_attributes (with scaling formulas)

CSV (example.csv) to DAT

swift-cli.py example.csv -mv '?' -t result.dat -ta "name:s['M.+a']; birth_date:d/'%Y-%m-%d'[x>='1991-01-01']; credits:n[50<=x<=90]; study[0='false', 1='true']; sex:e['man']"

result.dat

1 2 3 4
0
0 2 3
3 4

Further possible conversions:

ARFF to DAT
DATA to DAT
DTL to DAT

Scale and object names specification

Required arguments:

source
--target / --target_format
--objects
--target_attributes (with scaling formulas)

CSV (example.csv) to CXT

swift-cli.py example.csv -mv '?' -t result.cxt -o a,b,c,d -ta "name:s['M.+a']; birth_date:d/'%Y-%m-%d'[x>='1991-01-01']; credits:n[50<=x<=90]; study[0='false', 1='true']; sex:e['man']"

result.cxt

B

4
5
a
b
c
d
name
birth_date
credits
study
sex
.XXXX
X....
X.XX.
...XX

Further possible conversions:

ARFF to CXT
DATA to CXT

Scale and classes specification

Required arguments:

source
--target / --target_format
--target_attributes (with scaling formulas)
--classes

CSV (example.csv) to DTL

swift-cli.py example.csv -mv '?' -t result.dtl -cls 3,4 -ta "name:s['M.+a']; birth_date:d/'%Y-%m-%d'[x>='1991-01-01']; credits:n[50<=x<=90]; study[0='false', 1='true']; sex:e['man']"

result.dtl

1 2 3 4|true man
0|false woman
0 2 3|true woman
3 4|true man
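The attribute part of each row above (the indexes before "|") can be reproduced by evaluating the five scales on the sample data by hand. The sketch below hard-codes the rows of example.csv and writes each scale as a plain Python predicate; it illustrates the semantics only and does not parse the --target_attributes argument.

```python
import re
from datetime import datetime

MISSING = "?"  # the value passed with -mv

rows = [
    ("George", "1991-06-13", "54", "true", "man"),
    ("Monica", "1990-04-23", "98", "false", "woman"),
    ("Mia", "?", "87", "true", "woman"),
    ("John", "1989-11-11", "91", "true", "man"),
]

def parse_date(text):
    return datetime.strptime(text, "%Y-%m-%d")

# One predicate per formula of the --target_attributes argument.
scales = [
    lambda v: re.search(r"M.+a", v) is not None,          # name:s['M.+a']
    lambda v: parse_date(v) >= parse_date("1991-01-01"),  # birth_date: x>='1991-01-01'
    lambda v: 50 <= float(v) <= 90,                       # credits:n[50<=x<=90]
    lambda v: v == "true",                                # study[0='false', 1='true']
    lambda v: v == "man",                                 # sex:e['man']
]

def scale_row(row):
    """Return the indexes of attributes whose scale evaluates to True."""
    # A missing value is never passed to its scale; its result is always False.
    return [i for i, (value, scale) in enumerate(zip(row, scales))
            if value != MISSING and scale(value)]

for row in rows:
    print(*scale_row(row))
# 1 2 3 4
# 0
# 0 2 3
# 3 4
```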

Further possible conversions:

ARFF to DTL
DATA to DTL
DTL to DTL

Scale, object names and attribute names specification

Required arguments:

source
--target / --target_format
--target_attributes (with scaling formulas and new attribute names; if names are omitted, indexes of attributes are used instead)
--objects

DTL (example.dtl) to CXT

swift-cli.py example.dtl -t result.cxt -ta "b1,b2,b3,b4,b5=0-4;class1:e['b'];class2:e['bb']" -o a,b,c,d,e

B

5
7
a
b
c
d
e
b1
b2
b3
b4
b5
class1
class2
XXXXX.X
.XXXX..
..XXXXX
...XX.X
....XXX

Binary Unpacking

CSV (example.csv) to CSV

swift-cli.py example.csv -t result.csv -ta "*[]" -tnh

result.csv

1,0,0,0,1,0,0,0,1,0,0,0,1,0,1,0
0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,1
0,0,1,0,0,0,1,0,0,0,1,0,1,0,0,1
0,0,0,1,0,0,0,1,0,0,0,1,1,0,1,0

Filter

CSV (example.csv) to CSV

swift-cli.py example.csv -t result.csv -ta "study;name" -sl "2-"

study,name
true,George
false,Monica

Info

DATA (example.data)

swift-cli.py example.data -i

Relation name:
Objects count: 4
Attributes count: 4
====================

name: age
index: 0
type: numeric
max: 44.0, min: 30.0
values appearance:
    44.0: 1/4 = 25.00%
    30.0: 1/4 = 25.00%
    35.0: 1/4 = 25.00%
    31.0: 1/4 = 25.00%

name: job
index: 1
type: nominal
values appearance:
    ?: 1/4 = 25.00% (none value)
    doctor: 1/4 = 25.00%
    teacher: 1/4 = 25.00%
    pilot: 1/4 = 25.00%

name: work
index: 2
type: nominal
values appearance:
    1: 2/4 = 50.00%
    0: 2/4 = 50.00%

name: class
index: 3
type: nominal
values appearance:
    bar: 1/4 = 25.00%
    foo: 3/4 = 75.00%
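The "values appearance" lines are plain frequency counts. A rough sketch with collections.Counter, using the hypothetical helper name values_appearance; the order of the printed values in Swift's real output may differ from the insertion order used here.

```python
from collections import Counter

def values_appearance(values):
    """Format occurrence counts as 'value: count/total = percent'."""
    counts = Counter(values)
    total = len(values)
    return [f"{value}: {count}/{total} = {count / total:.2%}"
            for value, count in counts.items()]

for line in values_appearance(["foo", "bar", "foo", "foo"]):
    print(line)
# foo: 3/4 = 75.00%
# bar: 1/4 = 25.00%
```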

Preview

DAT (example.dat)

swift-cli.py example.dat -p

INSTALLATION

Requirements

Make sure that the following dependencies are installed on your computer:

Python3
Pyparsing
PyQt4 – only for graphical user interface (swift.py)

Download

This program can be obtained from the repository using git:

git clone git@github.com:gnovis/swift.git

or by the direct link.

Unpack

If you download the source through the direct link, you need to unpack the ZIP archive. For example by using the unzip program (on Linux):

cd ~/Downloads
unzip swift-master
mv swift-master swift

Finish

Go to the swift directory (cd path/to/swift/) and start using Swift with the swift-cli.py or swift.py scripts.

REPORTING BUGS

If you find a bug, please create an issue with a description via the issue tracker.

CONTRIBUTING

You are welcome to participate in the development of this project. Join us on GitHub.

AUTHORS

Created by Jan Nováček <novacekj5@gmail.com> and Jan Outrata <jan.outrata@upol.cz>.

Proofreaders

Veronika Nováčková <veronika.novackovaa@gmail.com>

WEBSITE

The project website: http://gnovis.github.io/swift/.

LICENSE

Swift is distributed under the GNU GPL v3.