NAME

CAIDA::Tables - General purpose table objects


SYNOPSIS

    use CAIDA::Traffic2::FlowCounter;
    use CAIDA::Tables::Tuple_Table;  # Tuple_Table is only an example
    $table = new CAIDA::Tables::Tuple_Table;
    $counter = new CAIDA::Traffic2::FlowCounter($packets, $bytes, $flows,
                                                $first, $latest);
    $table->entry_add($src_ip, $dst_ip, $ip_protocol, $ports_ok,
                      $src_port, $dst_port, $counter);
    $counter = $table->entry_get($src_ip, $dst_ip, $ip_protocol,
                                    $ports_ok, $src_port, $dst_port);
    $data_hash_ref = $table->data();
    while (($opaque_key, $counter) = each %$data_hash_ref) {
        @fields = $table->get_key_fields($opaque_key);
        # Do stuff with @fields.
    }
    @top5 = $table->sort_by_counter_field('bytes', 5);
    foreach $opaque_key (@top5) {
        # Do stuff with $opaque_key
    }
    $size = $table->num_entries();
    #Many many aggregators...
    %agg_options = ('table_size' => 100);
    $ip_matrix = $table->aggregate_columns(0, 1); # Avoid using.
    $ip_matrix = $table->naggregate_columns(0, 1); # Avoid using.
    $ip_matrix = $table->make_IP_Matrix(\%agg_options);
    # IP_Matrix is only one example, see Conversion functions below.
    $save_file = new FileHandle("> save_file");
    $load_file = new FileHandle("< save_file");
    $table->save_text($save_file, $full_count);
    $table->load_text($load_file);
    $table->save_binary($save_file, $full_count);
    $table->load_binary($load_file);
    $table->add($other_table_of_same_type);
    $table->nadd($other_table_of_same_type);
    $table->clear();


DESCRIPTION

CAIDA::Tables is a set of containers for holding large amounts of data, all of which are associated with a counter. The default is to use a FlowCounter to count the numbers of bytes, packets, and flows associated with a set of data.

The API for these tables is in Perl, but the processing backends are written in both Perl and C++. Apart from a few extra options and the speed increase, the choice of backend should be transparent. When both are available, the C++ backend is tried first; if it fails, the Perl backend is used. (See also FORCE_C and FORCE_PERL.)

The Perl versions exist mainly to support systems with archaic C++ compilers; they may be removed in future releases.
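The "try the fast backend first, then fall back" behaviour described above is a common Perl pattern. The following is a minimal sketch of that pattern only; the module names, the pick_backend() helper, and the flag names are hypothetical and are not the names CAIDA::Tables uses internally.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a pure-Perl backend, defined inline and
# registered in %INC so that the sketch runs on its own.
{ package My::Table::Backend::PurePerl; sub name { 'perl' } }
BEGIN { $INC{'My/Table/Backend/PurePerl.pm'} = __FILE__ }

# Hypothetical backend list, fastest first. The XS module does not exist,
# which simulates a system where the compiled backend cannot be loaded.
my @candidates = ('My::Table::Backend::XS', 'My::Table::Backend::PurePerl');

# Return the name of the first backend that loads, honoring force flags.
sub pick_backend {
    my (%flag) = @_;
    my @try = @candidates;
    @try = ($candidates[0]) if $flag{force_c};
    @try = ($candidates[1]) if $flag{force_perl};
    for my $module (@try) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;    # no permitted backend could be loaded
}

my $chosen = pick_backend();
print "using $chosen\n";
```

In this sketch, forcing the compiled backend when it cannot be loaded yields no backend at all rather than a silent fallback.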

Each table type has its own module name. For example, to use an IP_Matrix:

    use CAIDA::Tables::IP_Matrix;

All tables support a general set of member functions, and in addition have specific transform functions to convert from one table into another.

The available tables (and associated keys) are:

Tuple_Table
(source IP, destination IP, IP protocol, ports ok, source port, destination port) Note: ports ok is a boolean indicating whether the source and destination ports are valid.

IP_Table
(IP)

IP_Matrix
(source IP, destination IP)

Proto_Ports_Table
(IP protocol, ports ok, source port, destination port) Note: ports ok is a boolean indicating whether the source and destination ports are valid.

Port_Table
(port)

Port_Matrix
(source port, destination port)

Proto_Table
(IP protocol)

AS_Table
(AS) Note: AS is a string.

AS_Matrix
(source AS, destination AS) Note: AS is a string.

Country_Table
(country)

Country_Matrix
(source country, destination country)

App_Table
(application)

AppInfo_Table
(description, name, group, contrib, date, notes, reference, url)

VPVC_Table
(vp/vc pair)

Prefix_Table
(prefix/masklength)

Prefix_Matrix
(source prefix/masklength, destination prefix/masklength)

Length_Table
(length)

Global flags

DEBUG
When set, enables certain error messages that would otherwise be silent.

FORCE_C
When set, only allows the usage of the C++ backend.

FORCE_PERL
When set, only allows the usage of the Perl backend.

Common functions

These examples assume Tuple_Table as the object being used.

new (COUNTER, OPTIONS)
new (COUNTER)
new ()
Creates a new table object whose class name specifies its key type. For example, new CAIDA::Tables::Tuple_Table creates a table with tuple keys (as shown above), new CAIDA::Tables::IP_Matrix creates a table of IP address pairs, etc. COUNTER refers to any object that implements new() and add() member functions. COUNTER's add() method takes another object of the same type as an argument, adds that object to itself, and returns a reference to itself. If COUNTER is omitted or undef, the table defaults to using FlowCounter.

NOTE: The C++ backend currently only supports using a FlowCounter object; the use of a non-FlowCounter for COUNTER will force the use of the Perl backend.
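As an illustration of the COUNTER contract described above (a new() constructor, and an add() that folds in another object of the same type and returns itself), here is a minimal sketch of a custom counter. PacketSizeCounter and its fields are hypothetical and are not part of CAIDA::Tables.

```perl
use strict;
use warnings;

# PacketSizeCounter is a hypothetical counter class, for illustration only.
package PacketSizeCounter;

sub new {
    my ($class, $count, $max_size) = @_;
    my $self = { count => $count // 0, max_size => $max_size // 0 };
    return bless $self, $class;
}

# Fold another counter of the same type into this one and return $self,
# as the COUNTER contract requires.
sub add {
    my ($self, $other) = @_;
    $self->{count} += $other->{count};
    $self->{max_size} = $other->{max_size}
        if $other->{max_size} > $self->{max_size};
    return $self;
}

# Accessors, so that each field is also available as a method.
sub count    { $_[0]{count} }
sub max_size { $_[0]{max_size} }

package main;

my $total = PacketSizeCounter->new(3, 576);
$total->add(PacketSizeCounter->new(2, 1500));
printf "%d packets, largest %d bytes\n", $total->count, $total->max_size;
# prints "5 packets, largest 1500 bytes"
```

Because this is not a FlowCounter, a table using such a counter would run on the Perl backend.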

OPTIONS is a reference to a hash containing configuration options. The available options are:

force_perl
Boolean. Rarely used by the user, this option forces a new table to use the Perl backend instead of the C++ backend. See also FORCE_PERL.

table_size
Used only for the C++ backend, this specifies the size of the underlying hash table to help optimize memory usage.

entry_add (LIST, COUNTER)
Adds an entry to the table. LIST contains all the fields that make up the table's key, as listed above. COUNTER is a counter object of the type specified to new(). If there is an existing entry, COUNTER is added to the existing counter object. Returns a reference to the entry's counter.

entry_get (LIST)
Returns the counter data from the table. LIST contains all the fields that make up the table's key, as listed above.
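The accumulate-on-insert behaviour of entry_add() can be modelled with a plain hash. This is only a toy model of the semantics, not the module's implementation: the toy_ function names, the "\0" key join, and the hash-based counters are assumptions made for illustration.

```perl
use strict;
use warnings;

# Toy stand-ins: a hash plays the table, and a small hash reference plays
# the counter. Joining the key fields with "\0" is an assumption.
my %table;

sub toy_entry_add {
    my $counter = pop @_;          # last argument is the counter
    my $key = join "\0", @_;       # remaining arguments are the key fields
    if (my $existing = $table{$key}) {
        # Existing entry: add the new counts to the stored counter.
        $existing->{$_} += $counter->{$_} for keys %$counter;
        return $existing;
    }
    return $table{$key} = { %$counter };   # new entry: store a copy
}

sub toy_entry_get {
    return $table{ join "\0", @_ };
}

toy_entry_add('10.0.0.1', '10.0.0.2', { pkts => 1, bytes => 40 });
toy_entry_add('10.0.0.1', '10.0.0.2', { pkts => 2, bytes => 3000 });

my $c = toy_entry_get('10.0.0.1', '10.0.0.2');
print "$c->{pkts} packets, $c->{bytes} bytes\n";
# prints "3 packets, 3040 bytes"
```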

data ()
Returns a reference to a hash that gives direct read access to the data in the table. Each hash key is an opaque key, which get_key_fields() can decode, and each value is the corresponding counter.

get_key_fields (KEY)
Returns the individual fields of a particular opaque KEY (such as that returned by each or keys on the hash referenced by data()'s return value).

add (TABLE)
nadd (TABLE)
Performs a merge of another table into the current one. Returns a reference to the original table. TABLE must be of the same type as the original. nadd() is the same as add(), but it is free to do destructive operations on TABLE. TABLE should not be used again for anything after calling nadd().
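The difference between add() and nadd() can be pictured with two toy merge routines over plain hashes. This is a sketch of one possible reading of "destructive" (moving counters out of the argument instead of copying them); what the real nadd() does to TABLE is not specified here, and the toy_ names and data layout are hypothetical.

```perl
use strict;
use warnings;

# Toy tables of the form key => { bytes => N }. Illustration only.

# Non-destructive merge: $from is left untouched.
sub toy_add {
    my ($into, $from) = @_;
    for my $key (keys %$from) {
        $into->{$key} //= { bytes => 0 };
        $into->{$key}{bytes} += $from->{$key}{bytes};
    }
    return $into;
}

# Destructive merge: counters are moved out of $from, which ends up empty.
sub toy_nadd {
    my ($into, $from) = @_;
    for my $key (keys %$from) {
        my $counter = delete $from->{$key};
        if ($into->{$key}) { $into->{$key}{bytes} += $counter->{bytes} }
        else               { $into->{$key} = $counter }
    }
    return $into;
}

my $left  = { x => { bytes => 10 } };
my $right = { x => { bytes => 5 }, y => { bytes => 7 } };

toy_add($left, $right);     # $right still holds both entries
my $right_size_after_add = keys %$right;

toy_nadd($left, $right);    # $right has been emptied
print "x=$left->{x}{bytes} y=$left->{y}{bytes}\n";
# prints "x=20 y=14"
```

The destructive variant avoids allocating a new counter for keys that exist only in the argument, which is the kind of saving that makes a separate nadd() worthwhile.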

clear ()
Removes all entries from the table.

sort_by_counter_fields (FIELD, NUMBER, ASCEND)
sort_by_counter_fields (FIELD, NUMBER)
Returns a list of opaque keys, sorted by a specific counter field. FIELD specifies the field by which the keys are sorted. FIELD must be the same name as a method of the counter; for the default FlowCounter, acceptable field names are pkts, bytes, and flows.

NOTE: The C++ backend currently only supports using FlowCounter; thus only pkts, bytes, and flows fields are supported in the C++ backend.

NUMBER specifies how many of the top keys should be listed. If NUMBER is set to -1, then sort_by_counter_fields() returns all keys. ASCEND is a boolean which determines whether the results are sorted in ascending order. If ASCEND is omitted or set to false, then the results are in descending order.
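The FIELD, NUMBER and ASCEND arguments can be illustrated with a toy top-N sort over a plain hash. This is a sketch of the documented behaviour, not the module's code: the function name, the hash-based counters, and the tie-break on the key string are assumptions.

```perl
use strict;
use warnings;

# Toy data: opaque key => counter fields. Illustration only.
my %data = (
    k1 => { pkts => 10, bytes => 4000 },
    k2 => { pkts => 50, bytes => 2500 },
    k3 => { pkts => 5,  bytes => 7500 },
);

sub toy_sort_by_counter_field {
    my ($data, $field, $number, $ascend) = @_;
    # Sort ascending by the chosen field; ties are broken by key string.
    my @keys = sort {
        $data->{$a}{$field} <=> $data->{$b}{$field} or $a cmp $b
    } keys %$data;
    @keys = reverse @keys unless $ascend;    # descending is the default
    return @keys if $number < 0 || $number >= @keys;   # -1 means all keys
    return @keys[0 .. $number - 1];
}

my @top2 = toy_sort_by_counter_field(\%data, 'bytes', 2);
my @all  = toy_sort_by_counter_field(\%data, 'pkts', -1, 1);

print "top 2 by bytes: @top2\n";     # prints "top 2 by bytes: k3 k1"
print "all by pkts, ascending: @all\n";
# prints "all by pkts, ascending: k3 k1 k2"
```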

sort_by_keys (NUMBER, ASCEND)
sort_by_keys (NUMBER)
Returns a list of opaque keys, sorted by the key fields. The sort order is such that the first field is sorted first, the second is sorted second, etc. NUMBER specifies how many of the top keys should be listed. If NUMBER is set to -1, then sort_by_keys() returns all keys. ASCEND is a boolean which determines whether the results are sorted in ascending order. If ASCEND is omitted or set to false, then the results are in descending order.
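The field-by-field ordering described above is a lexicographic comparison: later fields matter only when all earlier fields are equal. A minimal sketch follows, assuming keys that are already decoded into numeric fields (IP protocol, source port, destination port); how the module compares non-numeric fields is not covered here, and compare_fields() is a hypothetical helper.

```perl
use strict;
use warnings;

# Each key is shown as an array of numeric fields (IP protocol, source
# port, destination port). Numeric comparison is an assumption.
my @keys = ([6, 80, 1024], [17, 53, 53], [6, 22, 4000], [6, 80, 1000]);

# Compare two keys field by field; the first differing field decides.
sub compare_fields {
    my ($x, $y) = @_;
    for my $i (0 .. $#$x) {
        my $order = $x->[$i] <=> $y->[$i];
        return $order if $order;
    }
    return 0;
}

my @ascending  = sort { compare_fields($a, $b) } @keys;
my @descending = reverse @ascending;    # the documented default order

print join(' ', map { join '/', @$_ } @ascending), "\n";
# prints "6/22/4000 6/80/1000 6/80/1024 17/53/53"
```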

num_entries ()
Returns the number of entries in the table.

load_text (FILE)
load_binary (FILE)
Loads data into the table from FILE. FILE can be either an open filehandle or a file name. Data already in the table is not overwritten; instead, the file data is added as if via a series of entry_add() calls. The file format is the one used by the save functions, although other applications (such as crl_flow) can also write that format.

The return value is the string which terminates the table, which applications might use to store useful information. An example of this might be:

    # end of Tuple Table, interface 0, vp:vc 2:134

In the case of a loading error, these functions return undef.

The file format is described here.

WARNING: These functions assume that the table uses a FlowCounter object. Use of these functions with a different counter object is a very bad idea.

save_text (FILE, SAVE_ALL)
save_text (FILE)
save_binary (FILE, SAVE_ALL)
save_binary (FILE)
Saves table data to FILE. FILE can either be an open filehandle or a file name. These functions are primarily meant to provide persistent storage of the table and not for reading by humans. However, the text format is provided for those who wish to do quick debugging or scanning of data. SAVE_ALL is a boolean that relates to the FlowCounter object, and when true, saves the first and latest fields. This is, sadly, a hack, which will be removed in the future. When SAVE_ALL is omitted, it defaults to true.

The return value is a boolean indicating whether the function successfully saved the table.

WARNING: These functions assume that the table uses a FlowCounter object. Use of these functions with a different counter object is a very bad idea.

Conversion functions

Several tables have functions to create a different table from their own data. For example, one can turn an IP matrix (with a source and destination IP address) into an IP table by combining any entries with the same source IP address.

All of the following accept a reference to a hash containing optional arguments to the function. For most functions (all except make_AS_Matrix, make_AS_Table, make_Country_Matrix and make_Country_Table), OPTIONS itself may be omitted.

The only option that applies to all tables is 'table_size', which is used with the C++ backend to specify the size of the internal hash table.
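The matrix-to-table conversion mentioned above (combining all entries that share a source IP address) amounts to re-keying and summing. The following toy version works on a plain hash; the tab-joined keys, the hash counters, and the function name are assumptions for illustration, not the module's internals.

```perl
use strict;
use warnings;

# Toy IP matrix: "source<TAB>destination" => counter. Illustration only.
my %ip_matrix = (
    "10.0.0.1\t10.0.0.2" => { pkts => 3, bytes => 1200 },
    "10.0.0.1\t10.0.0.3" => { pkts => 1, bytes => 40 },
    "10.0.0.4\t10.0.0.2" => { pkts => 2, bytes => 80 },
);

# Collapse the matrix onto its first key field, summing the counters of
# all entries that share the same source address.
sub toy_make_src_ip_table {
    my ($matrix) = @_;
    my %table;
    while (my ($key, $counter) = each %$matrix) {
        my ($src) = split /\t/, $key;
        $table{$src}{$_} += $counter->{$_} for keys %$counter;
    }
    return \%table;
}

my $src_table = toy_make_src_ip_table(\%ip_matrix);
for my $ip (sort keys %$src_table) {
    print "$ip: $src_table->{$ip}{pkts} pkts, $src_table->{$ip}{bytes} bytes\n";
}
# prints "10.0.0.1: 4 pkts, 1240 bytes"
# prints "10.0.0.4: 2 pkts, 80 bytes"
```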

make_IP_Matrix (OPTIONS)
make_IP_Matrix ()
Member function of Tuple_Table; makes an IP_Matrix.

make_src_IP_Table (OPTIONS)
make_src_IP_Table ()
Member function of Tuple_Table and IP_Matrix; makes an IP_Table from the source (first) IP address.

make_dst_IP_Table (OPTIONS)
make_dst_IP_Table ()
Member function of Tuple_Table and IP_Matrix; makes an IP_Table from the destination (second) IP address.

make_Proto_Ports_Table (OPTIONS)
make_Proto_Ports_Table ()
Member function of Tuple_Table; makes a Proto_Ports_Table.

make_Port_Matrix (OPTIONS)
make_Port_Matrix ()
Member function of Tuple_Table and Proto_Ports_Table; makes a Port_Matrix. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Matrix will only include ports used by that protocol.

make_src_Port_Table (OPTIONS)
make_src_Port_Table ()
Member function of Tuple_Table, Proto_Ports_Table and Port_Matrix; makes a Port_Table from the source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Table will only include ports used by that protocol.

make_dst_Port_Table (OPTIONS)
make_dst_Port_Table ()
Member function of Tuple_Table and Port_Matrix; makes a Port_Table from the destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Table will only include ports used by that protocol.

make_Proto_Table (OPTIONS)
make_Proto_Table ()
Member function of Tuple_Table and Proto_Ports_Table; makes a Proto_Table.

make_AS_Matrix (OPTIONS)
Member function of Tuple_Table and IP_Matrix; makes an AS_Matrix. OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_src_AS_Table (OPTIONS)
make_src_AS_Table ()
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an AS_Table. For Tuple_Table and IP_Matrix, OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_dst_AS_Table (OPTIONS)
make_dst_AS_Table ()
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an AS_Table. For Tuple_Table and IP_Matrix, OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_AS_Table (OPTIONS)
Member function of IP_Table; makes an AS_Table. OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_Country_Matrix (OPTIONS)
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes a Country_Matrix. OPTIONS must include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object. In addition, Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_src_Country_Table (OPTIONS)
make_src_Country_Table ()
Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a Country_Table. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object. In addition, Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_dst_Country_Table (OPTIONS)
make_dst_Country_Table ()
Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a Country_Table. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object. In addition, Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object.

make_Country_Table (OPTIONS)
Member function of IP_Table and AS_Table; makes a Country_Table. OPTIONS must include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object. In addition, IP_Table requires that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object.


ERRORS


EXAMPLES


ENVIRONMENT

LD_LIBRARY_PATH
If using the C++ backend, it might be necessary to set your LD_LIBRARY_PATH to find the proper libraries, such as libstdc++.


SEE ALSO

The FlowCounter manpage.


NOTES

There are three additional tables (only in Perl) which are meant to be used only for the creation of new tables. They are called Generic::Split, Generic::SingleKey, and Generic::Pack (within the CAIDA::Tables namespace). They are not meant for the general user, but anyone wishing to create a new table can use them as a starting point.


WARNINGS


DIAGNOSTICS


BUGS


RESTRICTIONS

Because the text file format used by load_text and save_text is tab-delimited, tab characters cannot be used in any keys if the text format is used. However, they are allowable if only load_binary and save_binary are used.
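The following sketch shows why a tab inside a key breaks a tab-delimited round trip. The line layout used here (key fields followed by a single count) is hypothetical and much simpler than the real save_text format; only the delimiter problem is being illustrated.

```perl
use strict;
use warnings;

# Hypothetical line layout: key fields, then one count, tab-separated.
sub toy_line  { return join "\t", @_ }
sub toy_parse { return split /\t/, $_[0], -1 }

# Two AS names and a count survive the round trip as three fields.
my @good = toy_parse(toy_line('AS701', 'AS1239', 42));

# A tab inside the first key is indistinguishable from a delimiter, so
# the same record comes back as four fields.
my @bad = toy_parse(toy_line("AS\t701", 'AS1239', 42));

printf "%d fields, then %d fields\n", scalar @good, scalar @bad;
# prints "3 fields, then 4 fields"
```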


AUTHORS

Ryan Koga <rkoga@caida.org>, David Moore <dmoore@caida.org>