Tables¶
Introduction¶
The Table
object represents a FastStats system table.
It holds metadata about the FastStats table,
and provides access to all its variables.
It can also be used to help specify what records are included
in a selection, data grid or cube.
Table objects for a FastStats system are automatically created
when you initialize a Session
connected to that system.
Basic use¶
Retrieving a table from a Session
object:
>>> bookings = my_session.tables["Bookings"]
Examining table metadata:
>>> print(
... f"There are {bookings.total_records:,} {bookings.plural.lower()}"
... f" in this system."
... )
...
There are 2,130,081 bookings in this system.
Accessing variables:
>>> cost = bookings["Cost"]
>>> len(bookings.variables)
14
Checking table relationships:
>>> households = my_session.tables["Households"]
>>> households.is_ancestor(bookings)
True
API Reference¶
General table properties & metadata¶
These attributes on the Table
object
-
Table.
name
: str¶ table reference name
-
Table.
singular
: str¶ noun for a single item from this table
-
Table.
plural
: str¶ noun for multiple items from this table
-
Table.
is_default
: bool¶ whether this is the default table for this FastStats system
-
Table.
is_people
: bool¶ whether this is the table representing people in this FastStats system
-
Table.
total_records
: int¶ total number of records on this table
Relationships with other tables¶
These attributes on the Table
object hold data
about this table’s related tables.
‘Related’ tables in this context covers:
parent, children, ancestors (this includes the parent),
descendants (this includes children).
-
Table.
child_relationship
: str¶ phrase to relate to this table from its parent, e.g. ‘customer <purchased the> product’
-
Table.
parent_relationship
: str¶ phrase to relate this table to its parent, e.g. ‘product <was purchased by the> customer’
-
Table.
has_children
: bool¶ whether this table has any child tables
-
Table.
parent_name
: str¶ name of this table’s parent table (an empty string for the master table)
-
Table.
parent
: Table¶ the parent table of this table (
None
for the master table)
-
Table.
children
: list[Table]¶ list of child tables of this table (an empty list if table has no children)
-
Table.
ancestors
: list[Table]¶ list of ancestor tables of this table (an empty list for the master table)
-
Table.
descendants
: list[Table]¶ list of descendant tables of this table (an empty list if table has no children)
Table comparison methods¶
These methods on the Table
object enable you to compare it
with another table to check how they are related, if at all.
-
Table.
is_same
(other)¶ Return whether this table is the same as other.
This comparison can also be performed using the
==
operator.
-
Table.
is_ancestor
(other, allow_same=False)¶ Return whether this table is an ancestor of other (the direct parent table also counts as an ancestor). If allow_same is set to True, this will also return True if the tables are the same.
This comparison can also be performed using the
<
operator, or<=
for allow_same=True.
-
Table.
is_descendant
(other, allow_same=False)¶ Return whether this table is a descendant of other (direct child tables also count as descendants). If allow_same is set to True, this will also return True if the tables are the same.
This comparison can also be performed using the
>
operator, or>=
for allow_same=True.
Return whether this table is either an ancestor or descendant of other. If allow_same is set to True, this will also return True if the tables are the same.
Note
In one sense, all the tables in a FastStats system are related, since they are all descended from a single master table. However, ‘related’ here is referring to direct ancestor or direct descendant relationships, not including indirect ‘sibling’ or ‘cousin’ relationships.
Table variables¶
The variables on the table can be accessed through the
variables
attribute.
This single object provides both a list-like and dict-like
interface for accessing variables.
Variables can be retrieved by indexing into this object
(using the []
operator)
with either the variable name or description:
>>> cost = bookings.variables["Cost"]
>>> destination = bookings.variables["boDest"]
This will raise a KeyError
if the string you pass in
doesn’t match a variable name or description.
There is also a shortcut for this by using the []
operator
directly on the table itself:
>>> cost = bookings["Cost"]
>>> destination = bookings["boDest"]
You can be explicit about picking by name or description
by using the names
or descs
attribute
on variables
:
>>> travel_date = bookings.variables.names["boTrav"]
>>> profit = bookings.variables.descs["Profit"]
This will similarly raise a KeyError
if the lookup string is not recognised.
This includes if you pass a valid variable description
to variables.names
and vice-versa.
The built-in len()
function will give
the number of variables on the table:
>>> len(bookings.variables)
14
You can iterate over the variables:
>>> for var in bookings.variables:
... if var.type == "Numeric":
... print(var.description)
...
Cost
Profit
You can also iterate over the variable names or descriptions:
>>> [n for n in bookings.variables.names if not n.startswith("bo")]
['deType', 'deGrade', 'deMgr', 'deFacil']
>>> [d for d in bookings.variables.descs if "date" in d.lower()]
['Booking Date', 'Travel Date', 'Busy dates']
Note
Iterating over variables
returns the Variable
objects, whereas iterating over names
or descs
returns the names or descriptions as strings.
See also
Refer to the Variables documentation for more details on using variable objects.
Data Grids and Cubes¶
-
datagrid
(columns, selection=None, max_rows=1000)¶ Build a data grid with this table as the resolve table.
>>> cols = ( [people[var] for var in ("Initial", "Surname")] + [bookings[var] for var in ("boDate", "boCost", "boDest")] ) >>> northern = households["Region"] == ["01", "02", "13"] >>> datagrid = bookings.datagrid(cols, northern, max_rows=100) >>> datagrid.to_df().head() Initial Surname Booking Date Cost Destination 0 A Allen 2020-08-11 551.81 France 1 W Livesey 2021-08-02 1167.57 Sierra Leone 2 W Livesey 2021-08-19 562.56 United States 3 W Livesey 2021-08-08 960.55 Australia 4 O Robinson 2021-08-22 455.60 United States
See also
This method is a wrapper around the
DataGrid
class. Refer to the Data Grid documentation for more details.
-
cube
(dimensions, measures=None, selection=None)¶ Build a cube with this table as the resolve table.
>>> cube = bookings.cube( [people["Occupation"], bookings["Product"]], selection=(bookings["Cost"] > 200), ) >>> df = cube.to_df() >>> df.unstack().rename(columns=lambda x: x.split(" ")[0]) Bookings Product Accommodation Flight Package Occupation Director 1714 8477 24585 Manager 4422 28566 109725 Manual Worker 4039 27104 77547 Professional 1806 9728 40072 Public Sector 18308 82437 249637 Retail Worker 9864 30853 126350 Retired 12750 47333 86594 Sales Executive 35214 152911 407288 Student 6553 27665 145156 Unemployed 8999 30648 57211
See also
This method is a wrapper around the
Cube
class. Refer to the Cube documentation for more details.