Change Log

Version 0.8.1

2023-07-18

Fixed

  • The limit() method on Selection now accepts a range of integers or percentages.

  • The sample() and limit() methods on Selection now correctly validate their inputs.

Version 0.8.0

2021-05-26

Changed (backwards-incompatible)

  • The to_df() method on Cube no longer returns Unclassified and Totals rows by default (can be included by using new unclassified and totals parameters).

Changed

  • The type attribute on Variable objects is now a Python Enum, though this is implemented in a way that is backwards-compatible with strings.
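
One common way to make an Enum interoperate with plain strings is to mix in str. The sketch below shows that general technique only; it is not taken from the library's source, and VariableType, its members and describe() are hypothetical names.

```python
from enum import Enum


class VariableType(str, Enum):
    """Hypothetical stand-in; member names and values are illustrative only."""

    SELECTOR = "Selector"
    NUMERIC = "Numeric"
    TEXT = "Text"


def describe(var_type):
    # Older calling code can keep comparing against plain strings.
    if var_type == "Selector":
        return "categorical"
    # Newer calling code can compare against Enum members instead.
    if var_type is VariableType.NUMERIC:
        return "numeric"
    return "other"
```

With this pattern, existing checks such as var_type == "Selector" keep working while new code gains the benefits of an Enum.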

Added

  • Added day, month, quarter and year attributes to DateVariable and DateTimeVariable objects, which band them to the corresponding time period for use as cube dimensions.

  • Added unclassified, totals and no_trans parameters to the to_df() method on Cube to control whether these rows are returned.

  • Added convert_index parameter to the to_df() method on Cube to control whether to automatically convert dimensions to their ‘natural’ index type.

  • Added before() and after() methods to TextVariable for querying it according to whether it occurs alphabetically before or after a given value. They have parameter allow_equal (default False) to also include records equal to the given value.

  • Added support for using < > operators with TextVariables, corresponding to the before() and after() methods.
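
The relationship between the < > operators and the before() and after() methods can be sketched with operator overloading. Everything below is a hypothetical stand-in (TextVariableSketch, RangeClauseSketch and the variable name are invented for illustration); the real clause objects built by the library are not shown in this change log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeClauseSketch:
    """Hypothetical record of what a before/after query asks for."""

    variable: str
    direction: str  # "before" or "after"
    value: str
    allow_equal: bool


class TextVariableSketch:
    """Hypothetical stand-in for a text variable."""

    def __init__(self, name):
        self.name = name

    def before(self, value, allow_equal=False):
        return RangeClauseSketch(self.name, "before", value, allow_equal)

    def after(self, value, allow_equal=False):
        return RangeClauseSketch(self.name, "after", value, allow_equal)

    # The operators delegate to the methods with allow_equal left at False.
    def __lt__(self, value):
        return self.before(value)

    def __gt__(self, value):
        return self.after(value)
```

In this sketch, surname < "N" builds the same clause as surname.before("N"), and including the boundary value requires calling the method with allow_equal=True.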

Fixed

  • TextVariable.between() method now ignores case, and checks that start comes before end.
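
As a rough illustration of the two behaviours described for between(), the plain-Python function below ignores case and rejects bounds given in the wrong order. It is a sketch under assumptions: the function name is hypothetical, the bounds are treated as inclusive, equal bounds are allowed, and the error type is illustrative rather than what the library raises.

```python
def between_ignoring_case(value, start, end):
    """Return True if value sorts between start and end, ignoring case."""
    low, high = start.casefold(), end.casefold()
    if low > high:
        # Assumption: the exception type and message are illustrative only.
        raise ValueError(f"start {start!r} must not come after end {end!r}")
    # Assumption: both bounds are inclusive.
    return low <= value.casefold() <= high
```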

Version 0.7.1

2021-01-27

Fixed

  • Creating a session with a new system no longer produces an error on the first attempt. Fixed by requiring apteco-api v0.2.1, in which the last_login parameter on the SessionDetails object is now optional.

Version 0.7.0

2021-01-26

Added

  • Added LimitClause, TopNClause, NPerVariableClause, NPerTableClause classes for creating selections with limits.

  • Added sample() and limit() methods to selections for applying limits to them.

  • Added cube variable statistics in new apteco.statistics module:

    • Sum

    • Mean

    • Populated

    • Min

    • Max

    • Median

    • Mode

    • Variance

    • StdDev

    • LowerQuartile

    • UpperQuartile

    • InterQuartileRange

    • CountDistinct

    • CountMode

  • Added missing() method to NumericVariable for selecting records with missing values.

Changed

  • Cubes can now have multiple measures rather than just the single default count.

  • For cubes with just a single dimension, Cube.to_df() now returns a DataFrame with a normal Index (rather than a MultiIndex).

Deprecated

  • NumericVariable.max → use NumericVariable.max_value

  • NumericVariable.min → use NumericVariable.min_value
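
A deprecated attribute is often kept as a thin alias that points to its replacement. The sketch below shows that general pattern with the standard library's DeprecationWarning; NumericVariableSketch is a hypothetical class, and whether the library itself emits a warning for these two attributes is an assumption, not something stated above.

```python
import warnings


class NumericVariableSketch:
    """Hypothetical stand-in with old and new attribute names."""

    def __init__(self, min_value, max_value):
        self.min_value = min_value
        self.max_value = max_value

    @property
    def min(self):
        # Old name: still returns the value, but points callers to the new one.
        warnings.warn(
            "min is deprecated, use min_value", DeprecationWarning, stacklevel=2
        )
        return self.min_value

    @property
    def max(self):
        warnings.warn(
            "max is deprecated, use max_value", DeprecationWarning, stacklevel=2
        )
        return self.max_value
```

Calling code migrates by replacing variable.max with variable.max_value and variable.min with variable.min_value.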

Fixed

  • Process that validates cube dimensions and measures has been improved to catch some invalid combinations that were previously allowed, and allow some valid combinations that were previously prevented.

Version 0.6.0

2020-10-22

Added

  • Added ability to pick variables by description.

  • Added table_name property to Variable classes (alias of table.name).

  • Added is_related() method to Table.

  • Added allow_same keyword to is_ancestor(), is_descendant() & is_related() methods on Table.

  • Added <= and >= operators to Table, corresponding to is_ancestor(allow_same=True) and is_descendant(allow_same=True).

  • Added methods to TextVariable to enable querying text variables with different text match types:

    • equals()

    • contains()

    • startswith()

    • endswith()

    • between()

    • matches()

  • Added datagrid() and cube() methods to tables and selections to enable building Data Grids and Cubes directly from these.

  • Added AptecoDeprecationWarning for warning about deprecated features.
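
The table relationship methods and their operators can be pictured with a small parent/child hierarchy. This is a minimal sketch, not the library's implementation: TableSketch and the table names are hypothetical, and it assumes that a.is_ancestor(b) means "a sits above b", with ancestor tables comparing as less than their descendants.

```python
class TableSketch:
    """Hypothetical stand-in for a table in a parent/child hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def _ancestors(self):
        # Walk up the hierarchy from this table's parent to the root.
        table = self.parent
        while table is not None:
            yield table
            table = table.parent

    def is_ancestor(self, other, allow_same=False):
        # Assumed reading: True when self sits above other in the hierarchy.
        if self is other:
            return allow_same
        return any(table is self for table in other._ancestors())

    def is_descendant(self, other, allow_same=False):
        return other.is_ancestor(self, allow_same=allow_same)

    def __lt__(self, other):
        return self.is_ancestor(other)

    def __le__(self, other):
        return self.is_ancestor(other, allow_same=True)

    def __gt__(self, other):
        return self.is_descendant(other)

    def __ge__(self, other):
        return self.is_descendant(other, allow_same=True)


# Illustrative three-level hierarchy.
households = TableSketch("Households")
people = TableSketch("People", parent=households)
bookings = TableSketch("Bookings", parent=people)
```

Here households < bookings and households <= households are both true, while households < households is false because the strict operators leave allow_same at False.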

Changed

  • Simplified some Table attribute names (the old names still work but issue an AptecoDeprecationWarning):

    • singular_display_name -> singular

    • plural_display_name -> plural

    • is_default_table -> is_default

    • is_people_table -> is_people

    • child_relationship_name -> child_relationship

    • parent_relationship_name -> parent_relationship

    • has_child_tables -> has_children

  • Session.tables is now a TablesAccessor object instead of dict:

    • can pick tables by name using [] (as before)

    • can use for ... in to loop over tables

    • can use len() to get the number of tables

  • Session.variables and Table.variables are now VariablesAccessor objects instead of dicts:

    • can pick variables by name or description using [] (as before for names; support for descriptions is new)

    • can use for ... in to loop over variables

    • can use len() to get the number of variables (in the system or on the table)

    • has names attribute for picking by name-only (using []) and looping over variable names (using for ... in)

    • has descs attribute for picking by description-only (using []) and looping over variable descriptions (using for ... in)

    • has descriptions attribute, which is alias of descs

  • The columns on the DataFrame returned by DataGrid.to_df() now have the data type that matches the FastStats variable for that column.

  • Variables from ancestor tables can now be used as columns on a DataGrid.

  • Variables from related tables can now be used as dimensions on a Cube.
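
The accessor behaviour listed above (picking with [], looping with for ... in, len(), and the names and descs attributes) can be sketched with Python's container protocol. The classes below are hypothetical stand-ins rather than the library's code, the variable names are invented, and trying the name before the description is an assumption.

```python
from collections import namedtuple

# Hypothetical minimal variable with just a name and a description.
VariableSketch = namedtuple("VariableSketch", ["name", "description"])


class _Lookup:
    """Pick by one kind of key using [] and loop over those keys."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, key):
        return self._mapping[key]

    def __iter__(self):
        return iter(self._mapping)


class VariablesAccessorSketch:
    """Hypothetical stand-in for an accessor over a set of variables."""

    def __init__(self, variables):
        self._variables = list(variables)
        self.names = _Lookup({v.name: v for v in self._variables})
        self.descs = _Lookup({v.description: v for v in self._variables})
        self.descriptions = self.descs  # alias of descs

    def __getitem__(self, key):
        # Assumption: try the name first, then fall back to the description.
        for lookup in (self.names, self.descs):
            try:
                return lookup[key]
            except KeyError:
                continue
        raise KeyError(key)

    def __iter__(self):
        return iter(self._variables)

    def __len__(self):
        return len(self._variables)


variables = VariablesAccessorSketch(
    [VariableSketch("peGender", "Gender"), VariableSketch("peIncome", "Income")]
)
```

In this sketch variables["peGender"] and variables["Gender"] return the same object, whereas variables.names["Gender"] raises a KeyError because that lookup only accepts names.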

Removed

  • Removed the isin() and contains() methods from the Variable base class completely (they had previously been deprecated and raised NotImplementedError). contains() has been implemented on TextVariable, and it is planned to implement these methods for other applicable variable types in future.

Fixed

  • During the variables initialisation process, variables with an unrecognised type now log a warning rather than raising an exception (this means program execution can continue rather than stopping completely).

  • It is now possible to change the table of a selection to a table that is not a direct ancestor or descendant (this previously raised an OperationError).
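
The difference between logging a warning and raising an exception during initialisation can be sketched as follows. This is an illustration only: load_variables, KNOWN_TYPES and the logger name are hypothetical, and skipping the unrecognised variable is an assumption about what happens after the warning.

```python
import logging

logger = logging.getLogger("variables_sketch")

# Illustrative subset of type names, not the library's full list.
KNOWN_TYPES = {"Selector", "Numeric", "Text"}


def load_variables(raw_variables):
    """Keep variables with a known type and warn about the rest."""
    loaded = []
    for name, type_name in raw_variables:
        if type_name not in KNOWN_TYPES:
            # Warn and carry on, so one odd variable does not stop the load.
            logger.warning(
                "Skipping variable %r: unrecognised type %r", name, type_name
            )
            continue
        loaded.append((name, type_name))
    return loaded
```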

Version 0.5.0

2020-06-03

Added

  • Added DataGrid class for creating Data Grids (export of FastStats data).

  • Added Cube class for creating Cubes (summary of FastStats data).

  • Added to_df() method to DataGrid and Cube classes for converting these objects to a Pandas DataFrame.

Changed

  • You can now import login, login_with_password and Session, along with the new DataGrid and Cube, directly from the apteco package.

Removed

  • Removed select() method from Table and Clause classes and select() function from query module, as this was not publicly documented and the direct count() method is preferred over select().count. This also reserves the select name for potential future functionality.

Version 0.4.0

2020-04-07

Added

  • Added the ability to build selections using the ==, !=, <, >, <=, >= comparison operators with Selector, Numeric, Text, Array, FlagArray, Date, DateTime variables, and value(s) of the matching object type, e.g. DateVariable with a Python datetime.date object. (Note: not all FastStats variable types support all comparison operators.)

  • Added DateRangeClause, TimeRangeClause, DateTimeRangeClause classes for creating selection clauses.

  • Added is_ancestor(), is_descendant(), is_same() methods to Table class for checking table relationships.

  • Added count() method to Table class to enable direct counting of an empty query comprising just a table.

  • Added system_info attribute to Session class which returns FastStats system metadata as a namedtuple.

  • Added installation guide, tutorial, and reference guides for Session and Variable objects.

  • Added keywords and classifiers to project (for PyPI).

  • Added continuous integration using Azure Pipelines so tests now run automatically during the development process. This includes measuring test coverage.

Changed

  • login() and login_with_password() functions now return Session object directly, instead of an intermediary Credentials object.

  • The variables dictionaries on Session and Table objects now have variable names as keys, instead of descriptions.

  • Variable classes now have table attribute which returns the Table object for the table they belong to.

  • CriteriaClause classes no longer have table parameter in signature; their table attribute is derived from variable.

  • The comparison operators on tables are now reversed so that [ancestor table] < [descendant table] is true. This is to fit with the idea of the master table as the ‘root’ table and ancestor tables as taking precedence over child and descendant tables.

  • The user attribute on Session is now a namedtuple rather than its own User class.

  • If the master table can’t be found during session initialization, it now gives more specific error messages about what went wrong.

  • If table relations aren’t initialized correctly, it now tells you about all the cases that fail, not just the first one it finds.

Removed

  • Removed CombinedCategoriesVariable class, as its implementation didn’t cover all types of Combined Categories variables. Variables of this type have reverted to the more general SelectorVariable. It is planned to re-implement Combined Categories variable support in future.

  • Removed isin() method on variables, as it’s not applicable to all variable types. It is planned to re-implement this method for relevant variables in future.

Fixed

  • Session initialization process now loads all system tables, not just the first 10.

  • Using generators to return selector codes for building selections (with == operator) now works.
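
Accepting generators alongside lists usually comes down to materialising the values once before using them. The helper below is a hypothetical sketch of that idea (normalise_codes is not a library function); treating a single string as one code rather than as an iterable of characters is an assumption.

```python
def normalise_codes(values):
    """Return selector codes as a list, whatever form they arrive in."""
    if isinstance(values, str):
        # Assumption: a lone string is a single code, not a sequence of codes.
        return [values]
    # list() consumes a generator exactly once and also copies lists and tuples.
    return list(values)
```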

Version 0.3.2

2019-10-01

Fixed

  • Improved code syntax highlighting in the README.

Version 0.3.1

2019-10-01

Fixed

  • Set Getting Started guide as the README.

Version 0.3.0

2019-10-01

Added

  • Added DateListClause for creating selections with a list of dates.

  • Added select() method to Table class to enable counting empty queries.

Changed

  • Each variable type now has a specific class with only the attributes pertinent to it.

Version 0.2.0

2019-08-23

Added

  • Added serialize() and deserialize() methods to the Session class.

  • Added documentation (Getting Started guide and Change Log).

Version 0.1.2

2019-08-05

Fixed

  • Fixed not being able to connect to a different API host after first connection during any single Python session.

Version 0.1.1

2019-08-05

Fixed

  • Fixed isin() method on variables not working.

Version 0.1.0

2019-07-05

Added

  • Added login() and login_with_password() functions to log in to the API.

  • Added Session class for creating an API session.

  • Added Table class representing FastStats system tables.

  • Added support for accessing variables on a table using the [] operator with the variable description.

  • Added support for testing equality of tables using the == operator.

  • Added support for testing if a table is an ancestor or descendant of another using the > and < operators (respectively).

  • Added SelectorClause, CombinedCategoriesClause, NumericClause, TextClause, ArrayClause, FlagArrayClause classes for creating selection clauses.

  • Added support for creating selection clauses using the == operator on variables with str literals to set values.

  • Added isin() method on variables to select values using an iterable.

  • Added BooleanClause class to apply boolean logic to clauses (AND, OR, NOT).

  • Added support for applying boolean logic using the &, |, ~ operators on clauses.

  • Added TableClause class for changing resolve table level of clauses (ANY, THE).

  • Added support for using the * operator with a clause and a table to change the resolve table of the clause.

  • Added SubSelectionClause class for using a subselection in a selection.

  • Added Selection class for creating a selection from a query, with get_count() and set_table() methods.

  • Added select() method on clauses to create a Selection from the clause.

  • Added select() function for creating a selection using a clause.
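
Applying boolean logic with the &, |, ~ operators relies on Python's operator overloading. The sketch below shows the general technique with hypothetical ClauseSketch and BooleanSketch classes; it does not reproduce the library's clause classes or the query they send to the API.

```python
class ClauseSketch:
    """Hypothetical stand-in for a selection clause."""

    def __init__(self, label):
        self.label = label

    def __and__(self, other):
        return BooleanSketch("AND", self, other)

    def __or__(self, other):
        return BooleanSketch("OR", self, other)

    def __invert__(self):
        return BooleanSketch("NOT", self)

    def __repr__(self):
        return self.label


class BooleanSketch(ClauseSketch):
    """Hypothetical clause combining other clauses with AND, OR or NOT."""

    def __init__(self, operation, *operands):
        self.operation = operation
        self.operands = operands

    def __repr__(self):
        if self.operation == "NOT":
            return f"NOT {self.operands[0]!r}"
        joined = f" {self.operation} ".join(repr(o) for o in self.operands)
        return f"({joined})"
```

Because Python gives &, | and ~ higher precedence than comparison operators, clauses written with == or < generally need their own parentheses before being combined, for example (a == 1) & (b < 2).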