CAIDA::Tables - General purpose table objects
use CAIDA::Traffic2::FlowCounter;
use CAIDA::Tables::Tuple_Table; # Tuple_Table is only an example
$table = new CAIDA::Tables::Tuple_Table;
$counter = new CAIDA::Traffic2::FlowCounter($packets, $bytes, $flows,
$first, $latest);
$table->entry_add($src_ip, $dst_ip, $ip_protocol, $ports_ok,
$src_port, $dst_port, $counter);
$counter = $table->entry_get($src_ip, $dst_ip, $ip_protocol,
$ports_ok, $src_port, $dst_port);
$data_hash_ref = $table->data();
while (($opaque_key, $counter) = each %$data_hash_ref) {
@fields = $table->get_key_fields($opaque_key);
# Do stuff with @fields.
}
@top5 = $table->sort_by_counter_field('bytes', 5);
foreach $opaque_key (@top5) {
# Do stuff with $opaque_key
}
$size = $table->num_entries();
#Many many aggregators...
%agg_options = ('table_size' => 100);
$ip_matrix = $table->aggregate_columns(0, 1); # Avoid using.
$ip_matrix = $table->naggregate_columns(0, 1); # Avoid using.
$ip_matrix = $table->make_IP_Matrix(\%agg_options);
# IP_Matrix is only one example, see Conversion functions below.
$save_file = new FileHandle("> save_file");
$load_file = new FileHandle("< save_file");
$table->save_text($save_file, $full_count);
$table->load_text($load_file);
$table->save_binary($save_file, $full_count);
$table->load_binary($load_file);
$table->add($other_table_of_same_type);
$table->nadd($other_table_of_same_type);
$table->clear();
CAIDA::Tables is a set of containers for holding large amounts of
data, where each entry is associated with a counter. The default is
to use a FlowCounter to count the numbers of bytes, packets, and
flows associated with a set of data.
The API for these tables is in Perl, but the processing backends
are in both Perl and C++. Other than adding a few extra options, the C++
backend should be transparent (except for the speed increase).
When both are available, the C++ backend is tried first; if it
fails, the Perl backend is used. (See also FORCE_C and FORCE_PERL.)
The Perl versions exist mainly to support systems with archaic C++
compilers; they may be removed in future releases.
Each table type has its own module name. For example, to use an IP_Matrix:
use CAIDA::Tables::IP_Matrix;
All tables support a general set of member functions, and in addition
have specific transform functions to convert from one table into
another.
The available tables (and associated keys) are:
- Tuple_Table
-
(source IP, destination IP, IP protocol, ports ok, source port,
destination port) Note: ports ok is a boolean regarding the
validity of the source and destination ports.
- IP_Table
-
(IP)
- IP_Matrix
-
(source IP, destination IP)
- Proto_Ports_Table
-
(IP protocol, ports ok, source port, destination port) Note:
ports ok is a boolean regarding the validity of the source and
destination ports.
- Port_Table
-
(port)
- Port_Matrix
-
(source port, destination port)
- Proto_Table
-
(IP protocol)
- AS_Table
-
(AS) Note: AS is a string.
- AS_Matrix
-
(source AS, destination AS) Note: AS is a string.
- Country_Table
-
(country)
- Country_Matrix
-
(source country, destination country)
- App_Table
-
(application)
- AppInfo_Table
-
(description, name, group, contrib, date, notes, reference, url)
- VPVC_Table
-
(vp/vc pair)
- Prefix_Table
-
(prefix/masklength)
- Prefix_Matrix
-
(source prefix/masklength, destination prefix/masklength)
- Length_Table
-
(length)
- DEBUG
-
When set, enables certain error messages that would otherwise be
silent.
- FORCE_C
-
When set, only allows the usage of the C++ backend.
- FORCE_PERL
-
When set, only allows the usage of the Perl backend.
These examples assume Tuple_Table as the object being used.
- new (COUNTER, OPTIONS)
-
- new (COUNTER)
-
- new ()
-
Creates a new table object whose class name specifies its key type.
For example, new CAIDA::Tables::Tuple_Table creates a table with
tuple keys (as shown above), new CAIDA::Tables::IP_Matrix creates
a table of IP address pairs, etc. COUNTER refers to any object
that implements new() and add() member functions. COUNTER's
add() method takes another object of the same type as an argument,
adds that object to itself, and returns a reference to itself. If
COUNTER is omitted or undef, the table defaults to using FlowCounter.
NOTE: The C++ backend currently only supports using a FlowCounter
object; the use of a non-FlowCounter for COUNTER will force the
use of the Perl backend.
OPTIONS is a reference to a hash containing configuration options.
The available options are:
- force_perl
-
Boolean. Rarely used by the user, this option forces a new table
to use the Perl backend instead of the C++ backend. See also
FORCE_PERL.
- table_size
-
Used only for the C++ backend, this specifies the size of the
underlying hash table to help optimize memory usage.
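The COUNTER contract described above (a class with new() and add(),
where add() folds another counter of the same type into itself and
returns itself) can be sketched as follows. My::HitCounter and its
hits field are hypothetical names invented for this illustration;
they are not part of CAIDA::Tables. How the table expects COUNTER to
be passed is not shown here, and a counter like this would force the
Perl backend.

```perl
use strict;
use warnings;

# Hypothetical counter class (not part of CAIDA::Tables) that follows
# the documented contract: it implements new() and add().
package My::HitCounter;

sub new {
    my ($class, $hits) = @_;
    my $self = { hits => defined $hits ? $hits : 0 };
    return bless $self, $class;
}

# Add another counter of the same type into this one and return $self.
sub add {
    my ($self, $other) = @_;
    $self->{hits} += $other->{hits};
    return $self;
}

# Accessor method; sorting by counter field relies on such a method name.
sub hits { return $_[0]->{hits} }

package main;

my $total = My::HitCounter->new(3);
$total->add(My::HitCounter->new(4))->add(My::HitCounter->new(5));
print $total->hits, "\n";    # prints 12
```

Because add() returns the object itself, calls can be chained as in
the last lines of the sketch.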
- entry_add (LIST, COUNTER)
-
Adds an entry into the table. LIST contains all the fields that
make up the table's key, as listed above. COUNTER is a counter
object of the type specified by new(). If there is an existing
entry, COUNTER is added to the existing counter object. Returns a
reference to the entry's counter.
- entry_get (LIST)
-
Returns the counter data from the table. LIST contains all the
fields that make up the table's key, as listed above.
- data ()
-
Returns a hash reference that can be used to directly read the data
in the table. The hash keys are opaque; use get_key_fields() to
decode them.
- get_key_fields (KEY)
-
Returns the individual fields of a particular opaque KEY (such as
that returned by each or keys on the hash referenced by data()'s
return value).
- add (TABLE)
-
- nadd (TABLE)
-
Performs a merge of another table into the current one. Returns
a reference to the original table. TABLE must be of the same type
as the original. nadd() is the same as add(), but it is free
to do destructive operations on TABLE. TABLE should not be used
again for anything after calling nadd().
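As an illustration of the merge described above, the helper below
folds several tables of the same type into one running total. The
name merge_tables is made up for this sketch; it relies only on the
documented add() method.

```perl
use strict;
use warnings;

# Hypothetical helper: merge several tables of the same type into
# $total using add(), which leaves the source tables usable afterwards.
sub merge_tables {
    my ($total, @parts) = @_;
    for my $part (@parts) {
        next unless defined $part;    # tolerate missing intervals
        $total->add($part);           # add() returns the table it was called on
    }
    return $total;
}

# Intended use (requires CAIDA::Tables to be installed):
#   my $day = new CAIDA::Tables::Tuple_Table;
#   merge_tables($day, @hourly_tables);
```

If the source tables are temporaries that will never be read again,
nadd() could be substituted for add() in the loop.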
- clear ()
-
Removes all entries from the table.
- sort_by_counter_fields (FIELD, NUMBER, ASCEND)
-
- sort_by_counter_fields (FIELD, NUMBER)
-
Returns a list of opaque keys, sorted by a specific counter field. FIELD
specifies the field by which the keys are sorted. FIELD must be
the same name as a method of the counter; for the default FlowCounter,
acceptable field names are pkts, bytes, and flows.
NOTE: The C++ backend currently only supports using FlowCounter;
thus only pkts, bytes, and flows fields are supported in
the C++ backend.
NUMBER specifies how many of the top keys should be listed. If
NUMBER is set to -1, then sort_by_counter_fields() returns all keys. ASCEND
is a boolean which determines whether the results are sorted in
ascending order. If ASCEND is omitted or set to false, then the
results are in descending order.
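The ordering rules above can be modeled in plain Perl. This is a
sketch of the documented behavior, not the library's implementation:
top_keys is a hypothetical function, plain hashes stand in for counter
objects, and it assumes that with ASCEND set the NUMBER smallest
entries are returned.

```perl
use strict;
use warnings;

# Model of the documented ordering rules (not the library's code).
# %$data maps a key to a hash of counter fields.
sub top_keys {
    my ($data, $field, $number, $ascend) = @_;
    # Descending order is the default.
    my @keys = sort { $data->{$b}{$field} <=> $data->{$a}{$field} } keys %$data;
    @keys = reverse @keys if $ascend;
    # NUMBER of -1 means "all keys".
    return @keys if $number == -1 || $number >= @keys;
    return @keys[0 .. $number - 1];
}

my %data = (
    a => { pkts => 10, bytes => 500 },
    b => { pkts => 30, bytes => 100 },
    c => { pkts => 20, bytes => 900 },
);
print join(',', top_keys(\%data, 'bytes', 2)), "\n";       # prints c,a
print join(',', top_keys(\%data, 'pkts', -1, 1)), "\n";    # prints a,c,b
```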
- sort_by_keys (NUMBER, ASCEND)
-
- sort_by_keys (NUMBER)
-
Returns a list of opaque keys, sorted by the key fields. The sort order
is such that the first field is sorted first, the second is sorted
second, etc. NUMBER specifies how many of the top keys should be
listed. If NUMBER is set to -1, then sort_by_keys() returns all
keys. ASCEND is a boolean which determines whether the results
are sorted in ascending order. If ASCEND is omitted or set to
false, then the results are in descending order.
- num_entries ()
-
Returns the number of entries in the table.
- load_text (FILE)
-
- load_binary (FILE)
-
Loads data into the table from FILE. FILE can either be an open
filehandle or a file name. Data already in the table is not
overwritten. Instead, the file data is added as if via a set of
entry_add() calls. The file format is the one used by the save
functions, although other applications (such as crl_flow) can
output in that format as well.
The return value is the string which terminates the table, which
applications might use to store useful information. An example of
this might be:
# end of Tuple Table, interface 0, vp:vc 2:134
In the case of a loading error, these functions return undef.
The file format is described here.
WARNING: These functions assume that the table uses a FlowCounter
object. Use of these functions with a different counter object is
a very bad idea.
- save_text (FILE, SAVE_ALL)
-
- save_text (FILE)
-
- save_binary (FILE, SAVE_ALL)
-
- save_binary (FILE)
-
Saves table data to FILE. FILE can either be an open filehandle
or a file name. These functions are primarily meant to provide
persistent storage of the table and not for reading by humans.
However, the text format is provided for those who wish to do quick
debugging or scanning of data. SAVE_ALL is a boolean that relates
to the FlowCounter object, and when true, saves the first and
latest fields. This is, sadly, a hack, which will be removed
in the future. When SAVE_ALL is omitted, it defaults to true.
The return value is a boolean indicating whether the function
successfully saved the table.
WARNING: These functions assume that the table uses a FlowCounter
object. Use of these functions with a different counter object is
a very bad idea.
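The return values of the save and load functions can be checked as in
the sketch below. round_trip_text is a hypothetical helper; it uses
only the documented save_text(), load_text() and num_entries()
methods, passes FILE as a file name, and assumes that $fresh is an
empty table of the same type using a FlowCounter.

```perl
use strict;
use warnings;

# Hypothetical helper: save $table to $path in text format, then load
# the file into $fresh (an empty table of the same type). Returns the
# terminator string reported by load_text(), or nothing on failure.
sub round_trip_text {
    my ($table, $fresh, $path) = @_;
    $table->save_text($path) or return;        # boolean success value
    my $trailer = $fresh->load_text($path);    # undef on a loading error
    return unless defined $trailer;
    # Loading adds entries, so an empty table should now match in size.
    return unless $fresh->num_entries() == $table->num_entries();
    return $trailer;
}

# Intended use (requires CAIDA::Tables to be installed):
#   my $copy    = new CAIDA::Tables::Tuple_Table;
#   my $trailer = round_trip_text($table, $copy, 'save_file');
#   defined $trailer or die "round trip failed";
```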
Several tables have functions to create a different table from
their own data. For example, one can turn an IP matrix (with a
source and destination IP address) into an IP table by combining
any entries with the same source IP address.
All of the following accept a reference to a hash containing optional
arguments to the function. For most functions (all except
make_AS_Matrix, make_AS_Table, make_Country_Matrix and
make_Country_Table), OPTIONS may be omitted.
The only option that applies to all tables is 'table_size',
which is used with the C++ backend to specify the size of the
internal hash table.
- make_IP_Matrix (OPTIONS)
-
- make_IP_Matrix ()
-
Member function of Tuple_Table; makes an IP_Matrix.
- make_src_IP_Table (OPTIONS)
-
- make_src_IP_Table ()
-
Member function of Tuple_Table and IP_Matrix; makes an IP_Table
from the source (first) IP address.
- make_dst_IP_Table (OPTIONS)
-
- make_dst_IP_Table ()
-
Member function of Tuple_Table and IP_Matrix; makes an IP_Table
from the destination (second) IP address.
- make_Proto_Ports_Table (OPTIONS)
-
- make_Proto_Ports_Table ()
-
Member function of Tuple_Table; makes a Proto_Ports_Table.
- make_Port_Matrix (OPTIONS)
-
- make_Port_Matrix ()
-
Member function of Tuple_Table and Proto_Ports_Table; makes a
Port_Matrix. OPTIONS may include an entry whose key is 'protocol'
and whose value is a protocol number. If the protocol is set, then
the resulting Port_Matrix will only include ports used by that
protocol.
- make_src_Port_Table (OPTIONS)
-
- make_src_Port_Table ()
-
Member function of Tuple_Table, Proto_Ports_Table and Port_Matrix;
makes a Port_Table from the source (first) port. OPTIONS may
include an entry whose key is 'protocol' and whose value is a
protocol number. If the protocol is set, then the resulting
Port_Table will only include ports used by that protocol.
- make_dst_Port_Table (OPTIONS)
-
- make_dst_Port_Table ()
-
Member function of Tuple_Table and Port_Matrix; makes a Port_Table
from the destination (second) port. OPTIONS may include an entry
whose key is 'protocol' and whose value is a protocol number.
If the protocol is set, then the resulting Port_Table will only
include ports used by that protocol.
- make_Proto_Table (OPTIONS)
-
- make_Proto_Table ()
-
Member function of Tuple_Table and Proto_Ports_Table; makes a
Proto_Table.
- make_AS_Matrix (OPTIONS)
-
Member function of Tuple_Table and IP_Matrix; makes an AS_Matrix.
OPTIONS must include an entry whose key is 'as_finder' and whose
value is an ASFinder object.
- make_src_AS_Table (OPTIONS)
-
- make_src_AS_Table ()
-
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an
AS_Table. For Tuple_Table and IP_Matrix, OPTIONS must include an
entry whose key is 'as_finder' and whose value is an ASFinder
object.
- make_dst_AS_Table (OPTIONS)
-
- make_dst_AS_Table ()
-
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an
AS_Table. For Tuple_Table and IP_Matrix, OPTIONS must include an
entry whose key is 'as_finder' and whose value is an ASFinder
object.
- make_AS_Table (OPTIONS)
-
Member function of IP_Table; makes an AS_Table. OPTIONS must
include an entry whose key is 'as_finder' and whose value is an
ASFinder object.
- make_Country_Matrix (OPTIONS)
-
Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes a
Country_Matrix. OPTIONS must include an entry whose key is
'netgeo' and whose value is a NetGeo or NetGeoClient object.
In addition, Tuple_Table
and IP_Matrix require that OPTIONS also include an entry whose key
is 'as_finder' and whose value is an ASFinder object.
- make_src_Country_Table (OPTIONS)
-
- make_src_Country_Table ()
-
Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix;
makes a Country_Table. For Tuple_Table, IP_Matrix, and AS_Matrix,
OPTIONS must include an entry whose key is 'netgeo' and whose
value is a NetGeo or NetGeoClient object. In addition, Tuple_Table
and IP_Matrix require that OPTIONS also include an entry whose key
is 'as_finder' and whose value is an ASFinder object.
- make_dst_Country_Table (OPTIONS)
-
- make_dst_Country_Table ()
-
Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix;
makes a Country_Table. For Tuple_Table, IP_Matrix, and AS_Matrix,
OPTIONS must include an entry whose key is 'netgeo' and whose
value is a NetGeo or NetGeoClient object. In addition, Tuple_Table
and IP_Matrix require that OPTIONS also include an entry whose key
is 'as_finder' and whose value is an ASFinder object.
- make_Country_Table (OPTIONS)
-
Member function of IP_Table and AS_Table; makes a Country_Table.
OPTIONS must include an entry whose key is 'netgeo' and whose
value is a NetGeo or NetGeoClient object. In addition, IP_Table
requires that OPTIONS also include an entry whose key is 'as_finder'
and whose value is an ASFinder object.
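A conversion function is typically followed by a walk over the
resulting table. The sketch below collapses a Tuple_Table or IP_Matrix
to per-source totals. bytes_by_source is a hypothetical helper; it
assumes the default FlowCounter (whose bytes field is listed above)
and uses only make_src_IP_Table(), data() and get_key_fields().

```perl
use strict;
use warnings;

# Hypothetical helper: collapse a Tuple_Table (or IP_Matrix) to
# per-source-IP totals and return [ip, bytes] pairs, largest first.
sub bytes_by_source {
    my ($table, $table_size) = @_;
    my %options;
    # table_size is only a hint for the C++ backend; omit it if unknown.
    $options{table_size} = $table_size if defined $table_size;
    my $ip_table = $table->make_src_IP_Table(\%options);
    my $data     = $ip_table->data();
    my @rows;
    while (my ($key, $counter) = each %$data) {
        my ($ip) = $ip_table->get_key_fields($key);    # IP_Table key is (IP)
        push @rows, [ $ip, $counter->bytes ];
    }
    my @sorted = sort { $b->[1] <=> $a->[1] } @rows;
    return @sorted;
}

# Intended use (requires CAIDA::Tables to be installed):
#   printf "%s %d\n", @$_ for bytes_by_source($tuple_table, 100);
```

Functions that need 'as_finder' or 'netgeo' would take those objects
in the same OPTIONS hash.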
- LD_LIBRARY_PATH
-
If using the C++ backend, it might be necessary to set your
LD_LIBRARY_PATH to find the proper libraries, such as libstdc++.
See also the FlowCounter manpage.
There are three additional tables (only in Perl) which are meant
to be used only for the creation of new tables. They are called
Generic::Split, Generic::SingleKey, and Generic::Pack (within the
CAIDA::Tables namespace). They are not meant for the general user,
but anyone wishing to create a new table can use them as a starting
point.
Because the text file format used by load_text and save_text
is tab-delimited, tab characters cannot be used in any keys if the
text format is used. However, they are allowable if only load_binary
and save_binary are used.
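A table whose keys are arbitrary strings (AS names, for example) can
be guarded against this limitation before entries are added. The
check below is a minimal sketch; key_is_text_safe is a hypothetical
name and tests only for the tab character mentioned above.

```perl
use strict;
use warnings;

# Hypothetical check: true if none of the key fields contains a tab,
# so the key does not collide with the tab-delimited text format.
sub key_is_text_safe {
    my @fields = @_;
    return !grep { defined $_ && /\t/ } @fields;
}

print key_is_text_safe('US', 'CA') ? "ok\n" : "has tab\n";             # prints ok
print key_is_text_safe("AS\t701", 'AS 1239') ? "ok\n" : "has tab\n";    # prints has tab
```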
Ryan Koga <rkoga@caida.org>, David Moore <dmoore@caida.org>