Relational API

The Relational API is an alternative API that can be used to incrementally construct queries. The API is centered around GoosePyRelation nodes. The relations can be seen as symbolic representations of SQL queries.

Lazy Evaluation

The relations do not hold any data – and nothing is executed – until a method that triggers execution is called.

For example, we create a relation, which loads 1 billion rows:

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("from range(1_000_000_000)")

At the moment of execution, rel does not hold any data and no data is retrieved from the database.

By calling rel.show() or simply printing rel on the terminal, the first 10K rows are fetched. If there are more than 10K rows, the output window will show >9999 rows (as the amount of rows in the relation is unknown).

By calling an output method, the data is retrieved and stored in the specified format:

rel.to_table("example_rel")

# 100% ▕████████████████████████████████████████████████████████████▏ 

Relation Creation

This section contains the details on how a relation is created. The methods are lazy evaluated.

Name	Description
`from_arrow`	Create a relation object from an Arrow object
`from_csv_auto`	Create a relation object from the CSV file in 'name'
`from_df`	Create a relation object from the DataFrame in df
`from_parquet`	Create a relation object from the Parquet files
`from_query`	Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
`query`	Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
`read_csv`	Create a relation object from the CSV file in 'name'
`read_json`	Create a relation object from the JSON file in 'name'
`read_parquet`	Create a relation object from the Parquet files
`sql`	Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.
`table`	Create a relation object for the named table
`table_function`	Create a relation object from the named table function with given parameters
`values`	Create a relation object from the passed values
`view`	Create a relation object for the named view

`from_arrow`

Signature

from_arrow(self: _goose.GoosePyConnection, arrow_object: object) -> _goose.GoosePyRelation

Description

Create a relation object from an Arrow object

Parameters

arrow_object : pyarrow.Table, pyarrow.RecordBatch

Arrow object to create a relation from

Example

import goose
import pyarrow as pa

ids = pa.array([1], type=pa.int8())
texts = pa.array(['a'], type=pa.string())
example_table = pa.table([ids, texts], names=["id", "text"])

goose_conn = goose.connect()

rel = goose_conn.from_arrow(example_table)

rel.show()

Result

┌──────┬─────────┐
│  id  │  text   │
│ int8 │ varchar │
├──────┼─────────┤
│    1 │ a       │
└──────┴─────────┘

`from_csv_auto`

Signature

from_csv_auto(self: _goose.GoosePyConnection, path_or_buffer: object, **kwargs) -> _goose.GoosePyRelation

Description

Create a relation object from the CSV file in 'name'

Aliases: read_csv

Parameters

path_or_buffer : Union[str, StringIO, TextIOBase]

Path to the CSV file or buffer to read from.
header : Optional[bool], Optional[int]

Row number(s) to use as the column names, or None if no header.
compression : Optional[str]

Compression type (e.g., 'gzip', 'bz2').
sep : Optional[str]

Delimiter to use; defaults to comma.
delimiter : Optional[str]

Alternative delimiter to use.
dtype : Optional[Dict[str, str]], Optional[List[str]]

Data types for columns.
na_values : Optional[str], Optional[List[str]]

Additional strings to recognize as NA/NaN.
skiprows : Optional[int]

Number of rows to skip at the start.
quotechar : Optional[str]

Character used to quote fields.
escapechar : Optional[str]

Character used to escape delimiter or quote characters.
encoding : Optional[str]

Encoding to use for UTF when reading/writing.
parallel : Optional[bool]

Enable parallel reading.
date_format : Optional[str]

Format to parse dates.
timestamp_format : Optional[str]

Format to parse timestamps.
sample_size : Optional[int]

Number of rows to sample for schema inference.
all_varchar : Optional[bool]

Treat all columns as VARCHAR.
normalize_names : Optional[bool]

Normalize column names to lowercase.
null_padding : Optional[bool]

Enable null padding for rows with missing columns.
names : Optional[List[str]]

List of column names to use.
lineterminator : Optional[str]

Character to break lines on.
columns : Optional[Dict[str, str]]

Column mapping for schema.
auto_type_candidates : Optional[List[str]]

List of columns for automatic type inference.
max_line_size : Optional[int]

Maximum line size in bytes.
ignore_errors : Optional[bool]

Ignore parsing errors.
store_rejects : Optional[bool]

Store rejected rows.
rejects_table : Optional[str]

Table name to store rejected rows.
rejects_scan : Optional[str]

Scan to use for rejects.
rejects_limit : Optional[int]

Limit number of rejects stored.
force_not_null : Optional[List[str]]

List of columns to force as NOT NULL.
buffer_size : Optional[int]

Buffer size in bytes.
decimal : Optional[str]

Character to recognize as decimal point.
allow_quoted_nulls : Optional[bool]

Allow quoted NULL values.
filename : Optional[bool], Optional[str]

Add filename column or specify filename.
hive_partitioning : Optional[bool]

Enable Hive-style partitioning.
union_by_name : Optional[bool]

Union files by column name instead of position.
hive_types : Optional[Dict[str, str]]

Hive types for columns.
hive_types_autocast : Optional[bool]

Automatically cast Hive types.
connection : GoosePyConnection

Goose connection to use.

Example

import csv
import goose

goose_conn = goose.connect()

with open('code_example.csv', 'w', newline='') as csvfile:
    fieldnames = ['id', 'text']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'id': '1', 'text': 'a'})

rel = goose_conn.from_csv_auto("code_example.csv")

rel.show()

Result

┌───────┬─────────┐
│  id   │  text   │
│ int64 │ varchar │
├───────┼─────────┤
│     1 │ a       │
└───────┴─────────┘

`from_df`

Signature

from_df(self: _goose.GoosePyConnection, df: pandas.DataFrame) -> _goose.GoosePyRelation

Description

Create a relation object from the DataFrame in df

Parameters

df : pandas.DataFrame

A pandas DataFrame to be converted into a Goose relation.

Example

import goose
import pandas as pd

df = pd.DataFrame(data = {'id': [1], "text":["a"]})

goose_conn = goose.connect()

rel = goose_conn.from_df(df)

rel.show()

Result

┌───────┬─────────┐
│  id   │  text   │
│ int64 │ varchar │
├───────┼─────────┤
│     1 │ a       │
└───────┴─────────┘

`from_parquet`

Signature

from_parquet(*args, **kwargs)
Overloaded function.

1. from_parquet(self: _goose.GoosePyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> _goose.GoosePyRelation

Create a relation object from the Parquet files in file_glob

2. from_parquet(self: _goose.GoosePyConnection, file_globs: collections.abc.Sequence[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> _goose.GoosePyRelation

Create a relation object from the Parquet files in file_globs

Description

Create a relation object from the Parquet files

Aliases: read_parquet

Parameters

file_glob : str

File path or glob pattern pointing to Parquet files to be read.
binary_as_string : bool, default: False

Interpret binary columns as strings instead of blobs.
file_row_number : bool, default: False

Add a column containing the row number within each file.
filename : bool, default: False

Add a column containing the name of the file each row came from.
hive_partitioning : bool, default: False

Enable automatic detection of Hive-style partitions in file paths.
union_by_name : bool, default: False

Union Parquet files by matching column names instead of positions.
compression : object

Optional compression codec to use when reading the Parquet files.

Example

import goose
import pyarrow as pa
import pyarrow.parquet as pq

ids = pa.array([1], type=pa.int8())
texts = pa.array(['a'], type=pa.string())
example_table = pa.table([ids, texts], names=["id", "text"])

pq.write_table(example_table, "code_example.parquet")

goose_conn = goose.connect()

rel = goose_conn.from_parquet("code_example.parquet")

rel.show()

Result

┌──────┬─────────┐
│  id  │  text   │
│ int8 │ varchar │
├──────┼─────────┤
│    1 │ a       │
└──────┴─────────┘

`from_query`

Signature

from_query(self: _goose.GoosePyConnection, query: object, *, alias: str = '', params: object = None) -> _goose.GoosePyRelation

Description

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

Warning. Passing params to this method is discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Aliases: query, sql

Parameters

query : object

The SQL query or subquery to be executed and converted into a relation.
alias : str, default: ''

Optional alias name to assign to the resulting relation.
params : object

Optional query parameters. Discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.from_query("from range(1,2) tbl(id)")

rel.show()

Result

┌───────┐
│  id   │
│ int64 │
├───────┤
│     1 │
└───────┘

`query`

Signature

query(self: _goose.GoosePyConnection, query: object, *, alias: str = '', params: object = None) -> _goose.GoosePyRelation

Description

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

Warning. Passing params to this method is discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Aliases: from_query, sql

Parameters

query : object

The SQL query or subquery to be executed and converted into a relation.
alias : str, default: ''

Optional alias name to assign to the resulting relation.
params : object

Optional query parameters. Discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.query("from range(1,2) tbl(id)")

rel.show()

Result

┌───────┐
│  id   │
│ int64 │
├───────┤
│     1 │
└───────┘

`read_csv`

Signature

read_csv(self: _goose.GoosePyConnection, path_or_buffer: object, **kwargs) -> _goose.GoosePyRelation

Description

Create a relation object from the CSV file in 'name'

Aliases: from_csv_auto

Parameters

path_or_buffer : Union[str, StringIO, TextIOBase]

Path to the CSV file or buffer to read from.
header : Optional[bool], Optional[int]

Row number(s) to use as the column names, or None if no header.
compression : Optional[str]

Compression type (e.g., 'gzip', 'bz2').
sep : Optional[str]

Delimiter to use; defaults to comma.
delimiter : Optional[str]

Alternative delimiter to use.
dtype : Optional[Dict[str, str]], Optional[List[str]]

Data types for columns.
na_values : Optional[str], Optional[List[str]]

Additional strings to recognize as NA/NaN.
skiprows : Optional[int]

Number of rows to skip at the start.
quotechar : Optional[str]

Character used to quote fields.
escapechar : Optional[str]

Character used to escape delimiter or quote characters.
encoding : Optional[str]

Encoding to use for UTF when reading/writing.
parallel : Optional[bool]

Enable parallel reading.
date_format : Optional[str]

Format to parse dates.
timestamp_format : Optional[str]

Format to parse timestamps.
sample_size : Optional[int]

Number of rows to sample for schema inference.
all_varchar : Optional[bool]

Treat all columns as VARCHAR.
normalize_names : Optional[bool]

Normalize column names to lowercase.
null_padding : Optional[bool]

Enable null padding for rows with missing columns.
names : Optional[List[str]]

List of column names to use.
lineterminator : Optional[str]

Character to break lines on.
columns : Optional[Dict[str, str]]

Column mapping for schema.
auto_type_candidates : Optional[List[str]]

List of columns for automatic type inference.
max_line_size : Optional[int]

Maximum line size in bytes.
ignore_errors : Optional[bool]

Ignore parsing errors.
store_rejects : Optional[bool]

Store rejected rows.
rejects_table : Optional[str]

Table name to store rejected rows.
rejects_scan : Optional[str]

Scan to use for rejects.
rejects_limit : Optional[int]

Limit number of rejects stored.
force_not_null : Optional[List[str]]

List of columns to force as NOT NULL.
buffer_size : Optional[int]

Buffer size in bytes.
decimal : Optional[str]

Character to recognize as decimal point.
allow_quoted_nulls : Optional[bool]

Allow quoted NULL values.
filename : Optional[bool], Optional[str]

Add filename column or specify filename.
hive_partitioning : Optional[bool]

Enable Hive-style partitioning.
union_by_name : Optional[bool]

Union files by column name instead of position.
hive_types : Optional[Dict[str, str]]

Hive types for columns.
hive_types_autocast : Optional[bool]

Automatically cast Hive types.
connection : GoosePyConnection

Goose connection to use.

Example

import csv
import goose

goose_conn = goose.connect()

with open('code_example.csv', 'w', newline='') as csvfile:
    fieldnames = ['id', 'text']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'id': '1', 'text': 'a'})

rel = goose_conn.read_csv("code_example.csv")

rel.show()

Result

┌───────┬─────────┐
│  id   │  text   │
│ int64 │ varchar │
├───────┼─────────┤
│     1 │ a       │
└───────┴─────────┘

`read_json`

Signature

read_json(self: _goose.GoosePyConnection, path_or_buffer: object, *, columns: typing.Optional[object] = None, sample_size: typing.Optional[object] = None, maximum_depth: typing.Optional[object] = None, records: typing.Optional[str] = None, format: typing.Optional[str] = None, date_format: typing.Optional[object] = None, timestamp_format: typing.Optional[object] = None, compression: typing.Optional[object] = None, maximum_object_size: typing.Optional[object] = None, ignore_errors: typing.Optional[object] = None, convert_strings_to_integers: typing.Optional[object] = None, field_appearance_threshold: typing.Optional[object] = None, map_inference_threshold: typing.Optional[object] = None, maximum_sample_files: typing.Optional[object] = None, filename: typing.Optional[object] = None, hive_partitioning: typing.Optional[object] = None, union_by_name: typing.Optional[object] = None, hive_types: typing.Optional[object] = None, hive_types_autocast: typing.Optional[object] = None) -> _goose.GoosePyRelation

Description

Create a relation object from the JSON file in 'name'

Parameters

path_or_buffer : object

File path or file-like object containing JSON data to be read.
columns : object

Optional list of column names to project from the JSON data.
sample_size : object

Number of rows to sample for inferring JSON schema.
maximum_depth : object

Maximum depth to which JSON objects should be parsed.
records : str

Format string specifying whether JSON is in records mode.
format : str

Format of the JSON data (e.g., 'auto', 'newline_delimited').
date_format : object

Format string for parsing date fields.
timestamp_format : object

Format string for parsing timestamp fields.
compression : object

Compression codec used on the JSON data (e.g., 'gzip').
maximum_object_size : object

Maximum size in bytes for individual JSON objects.
ignore_errors : object

If True, skip over JSON records with parsing errors.
convert_strings_to_integers : object

If True, attempt to convert strings to integers where appropriate.
field_appearance_threshold : object

Threshold for inferring optional fields in nested JSON.
map_inference_threshold : object

Threshold for inferring maps from JSON object patterns.
maximum_sample_files : object

Maximum number of files to sample for schema inference.
filename : object

If True, include a column with the source filename for each row.
hive_partitioning : object

If True, enable Hive partitioning based on directory structure.
union_by_name : object

If True, align JSON columns by name instead of position.
hive_types : object

If True, use Hive types from directory structure for schema.
hive_types_autocast : object

If True, automatically cast data types to match Hive types.

Example

import goose
import json

with open("code_example.json", mode="w") as f:
    json.dump([{'id': 1, "text":"a"}], f)
    
goose_conn = goose.connect()

rel = goose_conn.read_json("code_example.json")

rel.show()

Result

┌───────┬─────────┐
│  id   │  text   │
│ int64 │ varchar │
├───────┼─────────┤
│     1 │ a       │
└───────┴─────────┘

`read_parquet`

Signature

read_parquet(*args, **kwargs)
Overloaded function.

1. read_parquet(self: _goose.GoosePyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> _goose.GoosePyRelation

Create a relation object from the Parquet files in file_glob

2. read_parquet(self: _goose.GoosePyConnection, file_globs: collections.abc.Sequence[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> _goose.GoosePyRelation

Create a relation object from the Parquet files in file_globs

Description

Create a relation object from the Parquet files

Aliases: from_parquet

Parameters

file_glob : str

File path or glob pattern pointing to Parquet files to be read.
binary_as_string : bool, default: False

Interpret binary columns as strings instead of blobs.
file_row_number : bool, default: False

Add a column containing the row number within each file.
filename : bool, default: False

Add a column containing the name of the file each row came from.
hive_partitioning : bool, default: False

Enable automatic detection of Hive-style partitions in file paths.
union_by_name : bool, default: False

Union Parquet files by matching column names instead of positions.
compression : object

Optional compression codec to use when reading the Parquet files.

Example

import goose
import pyarrow as pa
import pyarrow.parquet as pq

ids = pa.array([1], type=pa.int8())
texts = pa.array(['a'], type=pa.string())
example_table = pa.table([ids, texts], names=["id", "text"])

pq.write_table(example_table, "code_example.parquet")

goose_conn = goose.connect()

rel = goose_conn.read_parquet("code_example.parquet")

rel.show()

Result

┌──────┬─────────┐
│  id  │  text   │
│ int8 │ varchar │
├──────┼─────────┤
│    1 │ a       │
└──────┴─────────┘

`sql`

Signature

sql(self: _goose.GoosePyConnection, query: object, *, alias: str = '', params: object = None) -> _goose.GoosePyRelation

Description

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

Warning. Passing params to this method is discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Aliases: from_query, query

Parameters

query : object

The SQL query or subquery to be executed and converted into a relation.
alias : str, default: ''

Optional alias name to assign to the resulting relation.
params : object

Optional query parameters. Discouraged due to significant performance overhead. Use execute() for parameterized queries instead.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("from range(1,2) tbl(id)")

rel.show()

Result

┌───────┐
│  id   │
│ int64 │
├───────┤
│     1 │
└───────┘

`table`

Signature

table(self: _goose.GoosePyConnection, table_name: str) -> _goose.GoosePyRelation

Description

Create a relation object for the named table

Parameters

table_name : str

Name of the table to create a relation from.

Example

import goose

goose_conn = goose.connect()

goose_conn.sql("create table code_example as select * from range(1,2) tbl(id)")

rel = goose_conn.table("code_example")

rel.show()

Result

┌───────┐
│  id   │
│ int64 │
├───────┤
│     1 │
└───────┘

`table_function`

Signature

table_function(self: _goose.GoosePyConnection, name: str, parameters: object = None) -> _goose.GoosePyRelation

Description

Create a relation object from the named table function with given parameters

Parameters

name : str

Name of the table function to call.
parameters : object

Optional parameters to pass to the table function.

Example

import goose

goose_conn = goose.connect()

goose_conn.sql("""
    create macro get_record_for(x) as table
    select x*range from range(1,2)
""")

rel = goose_conn.table_function(name="get_record_for", parameters=[1])

rel.show()

Result

┌───────────────┐
│ (1 * "range") │
│     int64     │
├───────────────┤
│             1 │
└───────────────┘

`values`

Signature

values(self: _goose.GoosePyConnection, *args) -> _goose.GoosePyRelation

Description

Create a relation object from the passed values

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.values([1, 'a'])

rel.show()

Result

┌───────┬─────────┐
│ col0  │  col1   │
│ int32 │ varchar │
├───────┼─────────┤
│     1 │ a       │
└───────┴─────────┘

`view`

Signature

view(self: _goose.GoosePyConnection, view_name: str) -> _goose.GoosePyRelation

Description

Create a relation object for the named view

Parameters

view_name : str

Name of the view to create a relation from.

Example

import goose

goose_conn = goose.connect()

goose_conn.sql("create table code_example as select * from range(1,2) tbl(id)")

rel = goose_conn.view("code_example")

rel.show()

Result

┌───────┐
│  id   │
│ int64 │
├───────┤
│     1 │
└───────┘

Relation Definition Details

This section contains the details on how to inspect a relation.

Name	Description
`alias`	Get the name of the current alias
`columns`	Return a list containing the names of the columns of the relation.
`describe`	Gives basic statistics (e.g., min, max) and if NULL exists for each column of the relation.
`description`	Return the description of the result
`dtypes`	Return a list containing the types of the columns of the relation.
`explain`	explain(self: _goose.GoosePyRelation, type: _goose.ExplainType = 'standard') -> str
`query`	Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object
`set_alias`	Rename the relation object to new alias
`shape`	Tuple of # of rows, # of columns in relation.
`show`	Display a summary of the data
`sql_query`	Get the SQL query that is equivalent to the relation
`type`	Get the type of the relation.
`types`	Return a list containing the types of the columns of the relation.

`alias`

Description

Get the name of the current alias

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.alias

Result

unnamed_relation_43c808c247431be5

`columns`

Description

Return a list containing the names of the columns of the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.columns

Result

 ['id', 'description', 'value', 'created_timestamp']

`describe`

Signature

describe(self: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Gives basic statistics (e.g., min, max) and if NULL exists for each column of the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.describe()

Result

┌─────────┬──────────────────────────────────────┬─────────────────┬────────────────────┬────────────────────────────┐
│  aggr   │                  id                  │   description   │       value        │     created_timestamp      │
│ varchar │               varchar                │     varchar     │       double       │          varchar           │
├─────────┼──────────────────────────────────────┼─────────────────┼────────────────────┼────────────────────────────┤
│ count   │ 9                                    │ 9               │                9.0 │ 9                          │
│ mean    │ NULL                                 │ NULL            │                5.0 │ NULL                       │
│ stddev  │ NULL                                 │ NULL            │ 2.7386127875258306 │ NULL                       │
│ min     │ 08fdcbf8-4e53-4290-9e81-423af263b518 │ value is even   │                1.0 │ 2025-04-09 15:41:20.642+02 │
│ max     │ fb10390e-fad5-4694-91cb-e82728cb6f9f │ value is uneven │                9.0 │ 2025-04-09 15:49:20.642+02 │
│ median  │ NULL                                 │ NULL            │                5.0 │ NULL                       │
└─────────┴──────────────────────────────────────┴─────────────────┴────────────────────┴────────────────────────────┘ 

`description`

Description

Return the description of the result

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.description

Result

[('id', 'UUID', None, None, None, None, None),
 ('description', 'STRING', None, None, None, None, None),
 ('value', 'NUMBER', None, None, None, None, None),
 ('created_timestamp', 'DATETIME', None, None, None, None, None)]  

`dtypes`

Description

Return a list containing the types of the columns of the relation.

Aliases: types

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.dtypes

Result

 [UUID, VARCHAR, BIGINT, TIMESTAMP WITH TIME ZONE]

`explain`

Description

explain(self: _goose.GoosePyRelation, type: _goose.ExplainType = 'standard') -> str

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.explain()

Result

┌───────────────────────────┐
│         PROJECTION        │
│    ────────────────────   │
│             id            │
│        description        │
│           value           │
│     created_timestamp     │
│                           │
│          ~9 Rows          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│           RANGE           │
│    ────────────────────   │
│      Function: RANGE      │
│                           │
│          ~9 Rows          │
└───────────────────────────┘

`query`

Signature

query(self: _goose.GoosePyRelation, virtual_table_name: str, sql_query: str) -> _goose.GoosePyRelation

Description

Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object

Parameters

virtual_table_name : str

The name to assign to the current relation when referenced in the SQL query.
sql_query : str

The SQL query string that uses the virtual table name to query the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.query(virtual_table_name="rel_view", sql_query="from rel")

goose_conn.sql("show rel_view")

Result

┌───────────────────┬──────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│    column_name    │       column_type        │  null   │   key   │ default │  extra  │
│      varchar      │         varchar          │ varchar │ varchar │ varchar │ varchar │
├───────────────────┼──────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ id                │ UUID                     │ YES     │ NULL    │ NULL    │ NULL    │
│ description       │ VARCHAR                  │ YES     │ NULL    │ NULL    │ NULL    │
│ value             │ BIGINT                   │ YES     │ NULL    │ NULL    │ NULL    │
│ created_timestamp │ TIMESTAMP WITH TIME ZONE │ YES     │ NULL    │ NULL    │ NULL    │
└───────────────────┴──────────────────────────┴─────────┴─────────┴─────────┴─────────┘

`set_alias`

Signature

set_alias(self: _goose.GoosePyRelation, alias: str) -> _goose.GoosePyRelation

Description

Rename the relation object to new alias

Parameters

alias : str

The alias name to assign to the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.set_alias('abc').select('abc.id')

Result

In the SQL query, the alias will be `abc`

`shape`

Description

Tuple of # of rows, # of columns in relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.shape

Result

(9, 4)

`show`

Signature

show(self: _goose.GoosePyRelation, *, max_width: typing.Optional[typing.SupportsInt] = None, max_rows: typing.Optional[typing.SupportsInt] = None, max_col_width: typing.Optional[typing.SupportsInt] = None, null_value: typing.Optional[str] = None, render_mode: object = None) -> None

Description

Display a summary of the data

Parameters

max_width : int

Maximum display width for the entire output in characters.
max_rows : int

Maximum number of rows to display.
max_col_width : int

Maximum number of characters to display per column.
null_value : str

String to display in place of NULL values.
render_mode : object

Render mode for displaying the output.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.show()

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 642ea3d7-793d-4867-a759-91c1226c25a0 │ value is uneven │     1 │ 2025-04-09 15:41:20.642+02 │
│ 6817dd31-297c-40a8-8e40-8521f00b2d08 │ value is even   │     2 │ 2025-04-09 15:42:20.642+02 │
│ 45143f9a-e16e-4e59-91b2-3a0800eed6d6 │ value is uneven │     3 │ 2025-04-09 15:43:20.642+02 │
│ fb10390e-fad5-4694-91cb-e82728cb6f9f │ value is even   │     4 │ 2025-04-09 15:44:20.642+02 │
│ 111ced5c-9155-418e-b087-c331b814db90 │ value is uneven │     5 │ 2025-04-09 15:45:20.642+02 │
│ 66a870a6-aef0-4085-87d5-5d1b35d21c66 │ value is even   │     6 │ 2025-04-09 15:46:20.642+02 │
│ a7e8e796-bca0-44cd-a269-1d71090fb5cc │ value is uneven │     7 │ 2025-04-09 15:47:20.642+02 │
│ 74908d48-7f2d-4bdd-9c92-1e7920b115b5 │ value is even   │     8 │ 2025-04-09 15:48:20.642+02 │
│ 08fdcbf8-4e53-4290-9e81-423af263b518 │ value is uneven │     9 │ 2025-04-09 15:49:20.642+02 │
└──────────────────────────────────────┴─────────────────┴───────┴────────────────────────────┘

`sql_query`

Signature

sql_query(self: _goose.GoosePyRelation) -> str

Description

Get the SQL query that is equivalent to the relation

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.sql_query()

Result

SELECT 
    gen_random_uuid() AS id, 
    concat('value is ', CASE  WHEN ((mod("range", 2) = 0)) THEN ('even') ELSE 'uneven' END) AS description, 
    "range" AS "value", 
    (now() + CAST(concat("range", ' ', 'minutes') AS INTERVAL)) AS created_timestamp 
FROM "range"(1, 10)

`type`

Description

Get the type of the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.type

Result

QUERY_RELATION

`types`

Description

Return a list containing the types of the columns of the relation.

Aliases: dtypes

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.types

Result

[UUID, VARCHAR, BIGINT, TIMESTAMP WITH TIME ZONE]

Transformation

This section contains the methods which can be used to chain queries. The methods are lazy evaluated.

Name	Description
`aggregate`	Compute the aggregate aggr_expr by the optional groups group_expr on the relation
`apply`	Compute the function of a single column or a list of columns by the optional groups on the relation
`cross`	Create cross/cartesian product of two relational objects
`except_`	Create the set except of this relation object with another relation object in other_rel
`filter`	Filter the relation object by the filter in filter_expr
`insert`	Inserts the given values into the relation
`insert_into`	Inserts the relation object into an existing table named table_name
`intersect`	Create the set intersection of this relation object with another relation object in other_rel
`join`	Join the relation object with another relation object in other_rel using the join condition expression in join_condition. Types supported are 'inner', 'left', 'right', 'outer', 'semi' and 'anti'
`limit`	Only retrieve the first n rows from this relation object, starting at offset
`map`	Calls the passed function on the relation
`order`	Reorder the relation object by order_expr
`project`	Project the relation object by the projection in project_expr
`select`	Project the relation object by the projection in project_expr
`sort`	Reorder the relation object by the provided expressions
`union`	Create the set union of this relation object with another relation object in other_rel
`update`	Update the given relation with the provided expressions

`aggregate`

Signature

aggregate(self: _goose.GoosePyRelation, aggr_expr: object, group_expr: str = '') -> _goose.GoosePyRelation

Description

Compute the aggregate aggr_expr by the optional groups group_expr on the relation

Parameters

aggr_expr : str, list[Expression]

The list of columns and aggregation functions.
group_expr : str, default: ''

The list of columns to be included in group_by. If None, group by all is applied.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.aggregate('max(value)')

Result

┌──────────────┐
│ max("value") │
│    int64     │
├──────────────┤
│            9 │
└──────────────┘
        

`apply`

Signature

apply(self: _goose.GoosePyRelation, function_name: str, function_aggr: str, group_expr: str = '', function_parameter: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Compute the function of a single column or a list of columns by the optional groups on the relation

Parameters

function_name : str

Name of the function to apply over the relation.
function_aggr : str

The list of columns to apply the function over.
group_expr : str, default: ''

Optional SQL expression for grouping.
function_parameter : str, default: ''

Optional parameters to pass into the function.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.apply(
    function_name="count", 
    function_aggr="id", 
    group_expr="description",
    projected_columns="description"
)

Result

┌─────────────────┬───────────┐
│   description   │ count(id) │
│     varchar     │   int64   │
├─────────────────┼───────────┤
│ value is uneven │         5 │
│ value is even   │         4 │
└─────────────────┴───────────┘

`cross`

Signature

cross(self: _goose.GoosePyRelation, other_rel: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Create cross/cartesian product of two relational objects

Parameters

other_rel : _goose.GoosePyRelation

Another relation to perform a cross product with.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.cross(other_rel=rel.set_alias("other_rel"))

Result

┌─────────────────────────────┬─────────────────┬───────┬───────────────────────────┬──────────────────────────────────────┬─────────────────┬───────┬───────────────────────────┐
│             id              │   description   │ value │     created_timestamp     │                  id                  │   description   │ value │     created_timestamp     │
│            uuid             │     varchar     │ int64 │ timestamp with time zone  │                 uuid                 │     varchar     │ int64 │ timestamp with time zone  │
├─────────────────────────────┼─────────────────┼───────┼───────────────────────────┼──────────────────────────────────────┼─────────────────┼───────┼───────────────────────────┤
│ cb2b453f-1a06-4f5e-abe1-b…  │ value is uneven │     1 │ 2025-04-10 09:53:29.78+02 │ cb2b453f-1a06-4f5e-abe1-bfd413581bcf │ value is uneven │     1 │ 2025-04-10 09:53:29.78+02 │
...

`except_`

Signature

except_(self: _goose.GoosePyRelation, other_rel: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Create the set except of this relation object with another relation object in other_rel

Parameters

other_rel : _goose.GoosePyRelation

The relation to subtract from the current relation (set difference).

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.except_(other_rel=rel.set_alias("other_rel"))

Result

The relation query is executed twice, therefore generating different ids and timestamps:
┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ f69ed6dd-a7fe-4de2-b6af-1c2418096d69 │ value is uneven │     3 │ 2025-04-10 11:43:05.711+02 │
│ 08ad11dc-a9c2-4aaa-9272-760b27ad1f5d │ value is uneven │     7 │ 2025-04-10 11:47:05.711+02 │
...

`filter`

Signature

filter(self: _goose.GoosePyRelation, filter_expr: object) -> _goose.GoosePyRelation

Description

Filter the relation object by the filter in filter_expr

Parameters

filter_expr : str, Expression

The filter expression to apply over the relation.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.filter("value = 2")

Result

┌──────────────────────────────────────┬───────────────┬───────┬───────────────────────────┐
│                  id                  │  description  │ value │     created_timestamp     │
│                 uuid                 │    varchar    │ int64 │ timestamp with time zone  │
├──────────────────────────────────────┼───────────────┼───────┼───────────────────────────┤
│ b0684ab7-fcbf-41c5-8e4a-a51bdde86926 │ value is even │     2 │ 2025-04-10 09:54:29.78+02 │
└──────────────────────────────────────┴───────────────┴───────┴───────────────────────────┘

`insert`

Signature

insert(self: _goose.GoosePyRelation, values: object) -> None

Description

Inserts the given values into the relation

Parameters

values : object

A tuple of values matching the relation column list, to be inserted.

Example

import goose

from datetime import datetime
from uuid import uuid4

goose_conn = goose.connect()

goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
).to_table("code_example")

rel = goose_conn.table("code_example")

rel.insert(
    (
        uuid4(), 
        'value is even',
        10, 
        datetime.now()
    )
)

rel.filter("value = 10")

Result

┌──────────────────────────────────────┬───────────────┬───────┬───────────────────────────────┐
│                  id                  │  description  │ value │       created_timestamp       │
│                 uuid                 │    varchar    │ int64 │   timestamp with time zone    │
├──────────────────────────────────────┼───────────────┼───────┼───────────────────────────────┤
│ c6dfab87-fae6-4213-8f76-1b96a8d179f6 │ value is even │    10 │ 2025-04-10 10:02:24.652218+02 │
└──────────────────────────────────────┴───────────────┴───────┴───────────────────────────────┘

`insert_into`

Signature

insert_into(self: _goose.GoosePyRelation, table_name: str) -> None

Description

Inserts the relation object into an existing table named table_name

Parameters

table_name : str

The table name to insert the data into. The relation must respect the column order of the table.

Example

import goose

from datetime import datetime
from uuid import uuid4

goose_conn = goose.connect()

goose_conn.sql("""
        select
            gen_random_uuid() as id,
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value,
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
).to_table("code_example")

rel = goose_conn.values(
    [
        uuid4(),
        'value is even',
        10,
        datetime.now()
    ]
)

rel.insert_into("code_example")

goose_conn.table("code_example").filter("value = 10")

Result

┌──────────────────────────────────────┬───────────────┬───────┬───────────────────────────────┐
│                  id                  │  description  │ value │       created_timestamp       │
│                 uuid                 │    varchar    │ int64 │   timestamp with time zone    │
├──────────────────────────────────────┼───────────────┼───────┼───────────────────────────────┤
│ 271c5ddd-c1d5-4638-b5a0-d8c7dc9e8220 │ value is even │    10 │ 2025-04-10 14:29:18.616379+02 │
└──────────────────────────────────────┴───────────────┴───────┴───────────────────────────────┘

`intersect`

Signature

intersect(self: _goose.GoosePyRelation, other_rel: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Create the set intersection of this relation object with another relation object in other_rel

Parameters

other_rel : _goose.GoosePyRelation

The relation to intersect with the current relation (set intersection).

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.intersect(other_rel=rel.set_alias("other_rel"))

Result

The relation query is executed once with `rel` and once with `other_rel`,
therefore generating different ids and timestamps:
┌──────┬─────────────┬───────┬──────────────────────────┐
│  id  │ description │ value │    created_timestamp     │
│ uuid │   varchar   │ int64 │ timestamp with time zone │
├──────┴─────────────┴───────┴──────────────────────────┤
│                        0 rows                         │
└───────────────────────────────────────────────────────┘

`join`

Signature

join(self: _goose.GoosePyRelation, other_rel: _goose.GoosePyRelation, condition: object, how: str = 'inner') -> _goose.GoosePyRelation

Description

Join the relation object with another relation object in other_rel using the join condition expression in join_condition. Types supported are 'inner', 'left', 'right', 'outer', 'semi' and 'anti'

Depending on how the condition parameter is provided, the JOIN clause generated is:

USING

import goose

goose_conn = goose.connect()

rel1 = goose_conn.sql("select range as id, concat('dummy 1', range) as text from range(1,10)")
rel2 = goose_conn.sql("select range as id, concat('dummy 2', range) as text from range(5,7)")

rel1.join(rel2, condition="id", how="inner").sql_query()

with following SQL:

SELECT * 
FROM (
        SELECT "range" AS id, 
            concat('dummy 1', "range") AS "text" 
        FROM "range"(1, 10)
    ) AS unnamed_relation_41bc15e744037078 
INNER JOIN (
        SELECT "range" AS id, 
        concat('dummy 2', "range") AS "text" 
        FROM "range"(5, 7)
    ) AS unnamed_relation_307e245965aa2c2b 
USING (id)

ON

import goose

goose_conn = goose.connect()

rel1 = goose_conn.sql("select range as id, concat('dummy 1', range) as text from range(1,10)")
rel2 = goose_conn.sql("select range as id, concat('dummy 2', range) as text from range(5,7)")

rel1.join(rel2, condition=f"{rel1.alias}.id = {rel2.alias}.id", how="inner").sql_query()

with the following SQL:

SELECT * 
FROM (
        SELECT "range" AS id, 
            concat('dummy 1', "range") AS "text" 
        FROM "range"(1, 10)
    ) AS unnamed_relation_41bc15e744037078 
INNER JOIN (
        SELECT "range" AS id, 
        concat('dummy 2', "range") AS "text" 
        FROM "range"(5, 7)
    ) AS unnamed_relation_307e245965aa2c2b 
ON ((unnamed_relation_41bc15e744037078.id = unnamed_relation_307e245965aa2c2b.id))

NATURAL, POSITIONAL and ASOF joins are not provided by the relational API. CROSS joins are provided through the cross method.

Parameters

other_rel : _goose.GoosePyRelation

The relation to join with the current relation.
condition : object

The join condition, typically a SQL expression or the duplicated column name to join on.
how : str, default: 'inner'

The type of join to perform: 'inner', 'left', 'right', 'outer', 'semi' and 'anti'.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.set_alias("rel").join(
    other_rel=rel.set_alias("other_rel"), 
    condition="rel.id = other_rel.id",
    how="left"
)

rel.count("*")

Result

┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│            9 │
└──────────────┘

`limit`

Signature

limit(self: _goose.GoosePyRelation, n: typing.SupportsInt, offset: typing.SupportsInt = 0) -> _goose.GoosePyRelation

Description

Only retrieve the first n rows from this relation object, starting at offset

Parameters

n : int

The maximum number of rows to return.
offset : int, default: 0

The number of rows to skip before starting to return rows.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.limit(1)

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 4135597b-29e7-4cb9-a443-41f3d54f25df │ value is uneven │     1 │ 2025-04-10 10:52:03.678+02 │
└──────────────────────────────────────┴─────────────────┴───────┴────────────────────────────┘

`map`

Signature

map(self: _goose.GoosePyRelation, map_function: collections.abc.Callable, *, schema: typing.Optional[object] = None) -> _goose.GoosePyRelation

Description

Calls the passed function on the relation

Parameters

map_function : Callable

A Python function that takes a DataFrame and returns a transformed DataFrame.
schema : object, default: None

Optional schema describing the structure of the output relation.

Example

import goose
from pandas import DataFrame

def multiply_by_2(df: DataFrame):
    df["id"] = df["id"] * 2
    return df

goose_conn = goose.connect()
rel = goose_conn.sql("select range as id, 'dummy' as text from range(1,3)")

rel.map(multiply_by_2, schema={"id": int, "text": str})

Result

┌───────┬─────────┐
│  id   │  text   │
│ int64 │ varchar │
├───────┼─────────┤
│     2 │ dummy   │
│     4 │ dummy   │
└───────┴─────────┘

`order`

Signature

order(self: _goose.GoosePyRelation, order_expr: str) -> _goose.GoosePyRelation

Description

Reorder the relation object by order_expr

Parameters

order_expr : str

SQL expression defining the ordering of the result rows.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.order("value desc").limit(1, offset=4)

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 55899131-e3d3-463c-a215-f65cb8aef3bf │ value is uneven │     5 │ 2025-04-10 10:56:03.678+02 │
└──────────────────────────────────────┴─────────────────┴───────┴────────────────────────────┘

`project`

Signature

project(self: _goose.GoosePyRelation, *args, groups: str = '') -> _goose.GoosePyRelation

Description

Project the relation object by the projection in project_expr

Aliases: select

Parameters

groups : str, default: ''

Comma-separated list of columns to include in the group by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.project("description").limit(1)

Result

┌─────────────────┐
│   description   │
│     varchar     │
├─────────────────┤
│ value is uneven │
└─────────────────┘

`select`

Signature

select(self: _goose.GoosePyRelation, *args, groups: str = '') -> _goose.GoosePyRelation

Description

Project the relation object by the projection in project_expr

Aliases: project

Parameters

groups : str, default: ''

Comma-separated list of columns to include in the group by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.select("description").limit(1)

Result

┌─────────────────┐
│   description   │
│     varchar     │
├─────────────────┤
│ value is uneven │
└─────────────────┘

`sort`

Signature

sort(self: _goose.GoosePyRelation, *args) -> _goose.GoosePyRelation

Description

Reorder the relation object by the provided expressions

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.sort("description")

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 5e0dfa8c-de4d-4ccd-8cff-450dabb86bde │ value is even   │     6 │ 2025-04-10 16:52:15.605+02 │
│ 95f1ad48-facf-4a84-a971-0a4fecce68c7 │ value is even   │     2 │ 2025-04-10 16:48:15.605+02 │
...

`union`

Signature

union(self: _goose.GoosePyRelation, union_rel: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Create the set union of this relation object with another relation object in other_rel

The union is union all. In order to retrieve distinct values, apply distinct.

Parameters

union_rel : _goose.GoosePyRelation

The relation to union with the current relation (set union).

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.union(union_rel=rel)

rel.count("*")

Result

┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│           18 │
└──────────────┘

`update`

Signature

update(self: _goose.GoosePyRelation, set: object, *, condition: object = None) -> None

Description

Update the given relation with the provided expressions

Parameters

set : object

Mapping of columns to new values for the update operation.
condition : object, default: None

Optional condition to filter which rows to update.

Example

import goose

from goose import ColumnExpression

goose_conn = goose.connect()

goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
).to_table("code_example")

rel = goose_conn.table("code_example")

rel.update(set={"description":None}, condition=ColumnExpression("value") == 1)

# the update is executed on the table, but not reflected on the relationship
# the relationship has to be recreated to retrieve the modified data
rel = goose_conn.table("code_example")

rel.show()

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 66dcaa14-f4a6-4a55-af3b-7f6aa23ab4ad │ NULL            │     1 │ 2025-04-10 16:54:49.317+02 │
│ c6a18a42-67fb-4c95-827b-c966f2f95b88 │ value is even   │     2 │ 2025-04-10 16:55:49.317+02 │
...

Functions

This section contains the functions which can be applied to a relation, in order to get a (scalar) result. The functions are lazy evaluated.

Name	Description
`any_value`	Returns the first non-null value from a given column
`arg_max`	Finds the row with the maximum value for a value column and returns the value of that row for an argument column
`arg_min`	Finds the row with the minimum value for a value column and returns the value of that row for an argument column
`avg`	Computes the average on a given column
`bit_and`	Computes the bitwise AND of all bits present in a given column
`bit_or`	Computes the bitwise OR of all bits present in a given column
`bit_xor`	Computes the bitwise XOR of all bits present in a given column
`bitstring_agg`	Computes a bitstring with bits set for each distinct value in a given column
`bool_and`	Computes the logical AND of all values present in a given column
`bool_or`	Computes the logical OR of all values present in a given column
`count`	Computes the number of elements present in a given column
`cume_dist`	Computes the cumulative distribution within the partition
`dense_rank`	Computes the dense rank within the partition
`distinct`	Retrieve distinct rows from this relation object
`favg`	Computes the average of all values present in a given column using a more accurate floating point summation (Kahan Sum)
`first`	Returns the first value of a given column
`first_value`	Computes the first value within the group or partition
`fsum`	Computes the sum of all values present in a given column using a more accurate floating point summation (Kahan Sum)
`geomean`	Computes the geometric mean over all values present in a given column
`histogram`	Computes the histogram over all values present in a given column
`lag`	Computes the lag within the partition
`last`	Returns the last value of a given column
`last_value`	Computes the last value within the group or partition
`lead`	Computes the lead within the partition
`list`	Returns a list containing all values present in a given column
`max`	Returns the maximum value present in a given column
`mean`	Computes the average on a given column
`median`	Computes the median over all values present in a given column
`min`	Returns the minimum value present in a given column
`mode`	Computes the mode over all values present in a given column
`n_tile`	Divides the partition as equally as possible into num_buckets
`nth_value`	Computes the nth value within the partition
`percent_rank`	Computes the relative rank within the partition
`product`	Returns the product of all values present in a given column
`quantile`	Computes the exact quantile value for a given column
`quantile_cont`	Computes the interpolated quantile value for a given column
`quantile_disc`	Computes the exact quantile value for a given column
`rank`	Computes the rank within the partition
`rank_dense`	Computes the dense rank within the partition
`row_number`	Computes the row number within the partition
`select_dtypes`	Select columns from the relation, by filtering based on type(s)
`select_types`	Select columns from the relation, by filtering based on type(s)
`std`	Computes the sample standard deviation for a given column
`stddev`	Computes the sample standard deviation for a given column
`stddev_pop`	Computes the population standard deviation for a given column
`stddev_samp`	Computes the sample standard deviation for a given column
`string_agg`	Concatenates the values present in a given column with a separator
`sum`	Computes the sum of all values present in a given column
`unique`	Returns the distinct values in a column.
`value_counts`	Computes the number of elements present in a given column, also projecting the original column
`var`	Computes the sample variance for a given column
`var_pop`	Computes the population variance for a given column
`var_samp`	Computes the sample variance for a given column
`variance`	Computes the sample variance for a given column

`any_value`

Signature

any_value(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the first non-null value from a given column

Parameters

column : str

The column name from which to retrieve any value.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...).
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.any_value('id')

Result

┌──────────────────────────────────────┐
│            any_value(id)             │
│                 uuid                 │
├──────────────────────────────────────┤
│ 642ea3d7-793d-4867-a759-91c1226c25a0 │
└──────────────────────────────────────┘

`arg_max`

Signature

arg_max(self: _goose.GoosePyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Finds the row with the maximum value for a value column and returns the value of that row for an argument column

Parameters

arg_column : str

The column name for which to find the argument maximizing the value.
value_column : str

The column name containing values used to determine the maximum.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...).
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.arg_max(arg_column="value", value_column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────────────┐
│   description   │ arg_max("value", "value") │
│     varchar     │           int64           │
├─────────────────┼───────────────────────────┤
│ value is uneven │                         9 │
│ value is even   │                         8 │
└─────────────────┴───────────────────────────┘

`arg_min`

Signature

arg_min(self: _goose.GoosePyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Finds the row with the minimum value for a value column and returns the value of that row for an argument column

Parameters

arg_column : str

The column name for which to find the argument minimizing the value.
value_column : str

The column name containing values used to determine the minimum.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.arg_min(arg_column="value", value_column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────────────┐
│   description   │ arg_min("value", "value") │
│     varchar     │           int64           │
├─────────────────┼───────────────────────────┤
│ value is even   │                         2 │
│ value is uneven │                         1 │
└─────────────────┴───────────────────────────┘

`avg`

Signature

avg(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the average on a given column

Parameters

column : str

The column name to calculate the average on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.avg('value')

Result

┌──────────────┐
│ avg("value") │
│    double    │
├──────────────┤
│          5.0 │
└──────────────┘
 

`bit_and`

Signature

bit_and(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the bitwise AND of all bits present in a given column

Parameters

column : str

The column name to perform the bitwise AND aggregation on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.select("description, value::bit as value_bit")

rel.bit_and(column="value_bit", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────────────────────────────────────┐
│   description   │                        bit_and(value_bit)                        │
│     varchar     │                               bit                                │
├─────────────────┼──────────────────────────────────────────────────────────────────┤
│ value is uneven │ 0000000000000000000000000000000000000000000000000000000000000001 │
│ value is even   │ 0000000000000000000000000000000000000000000000000000000000000000 │
└─────────────────┴──────────────────────────────────────────────────────────────────┘    

`bit_or`

Signature

bit_or(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the bitwise OR of all bits present in a given column

Parameters

column : str

The column name to perform the bitwise OR aggregation on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.select("description, value::bit as value_bit")

rel.bit_or(column="value_bit", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────────────────────────────────────┐
│   description   │                        bit_or(value_bit)                         │
│     varchar     │                               bit                                │
├─────────────────┼──────────────────────────────────────────────────────────────────┤
│ value is uneven │ 0000000000000000000000000000000000000000000000000000000000001111 │
│ value is even   │ 0000000000000000000000000000000000000000000000000000000000001110 │
└─────────────────┴──────────────────────────────────────────────────────────────────┘    

`bit_xor`

Signature

bit_xor(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the bitwise XOR of all bits present in a given column

Parameters

column : str

The column name to perform the bitwise XOR aggregation on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.select("description, value::bit as value_bit")

rel.bit_xor(column="value_bit", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────────────────────────────────────┐
│   description   │                        bit_xor(value_bit)                        │
│     varchar     │                               bit                                │
├─────────────────┼──────────────────────────────────────────────────────────────────┤
│ value is even   │ 0000000000000000000000000000000000000000000000000000000000001000 │
│ value is uneven │ 0000000000000000000000000000000000000000000000000000000000001001 │
└─────────────────┴──────────────────────────────────────────────────────────────────┘

`bitstring_agg`

Signature

bitstring_agg(self: _goose.GoosePyRelation, column: str, min: typing.Optional[object] = None, max: typing.Optional[object] = None, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes a bitstring with bits set for each distinct value in a given column

Parameters

column : str

The column name to aggregate as a bitstring.
min : object, default: None

Optional minimum bitstring value for aggregation.
max : object, default: None

Optional maximum bitstring value for aggregation.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.bitstring_agg(column="value", groups="description", projected_columns="description", min=1, max=9)

Result

┌─────────────────┬────────────────────────┐
│   description   │ bitstring_agg("value") │
│     varchar     │          bit           │
├─────────────────┼────────────────────────┤
│ value is uneven │ 101010101              │
│ value is even   │ 010101010              │
└─────────────────┴────────────────────────┘

`bool_and`

Signature

bool_and(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the logical AND of all values present in a given column

Parameters

column : str

The column name to perform the boolean AND aggregation on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.select("description, mod(value,2)::boolean as uneven")

rel.bool_and(column="uneven", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────┐
│   description   │ bool_and(uneven) │
│     varchar     │     boolean      │
├─────────────────┼──────────────────┤
│ value is even   │ false            │
│ value is uneven │ true             │
└─────────────────┴──────────────────┘

`bool_or`

Signature

bool_or(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the logical OR of all values present in a given column

Parameters

column : str

The column name to perform the boolean OR aggregation on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel = rel.select("description, mod(value,2)::boolean as uneven")

rel.bool_or(column="uneven", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────┐
│   description   │ bool_or(uneven) │
│     varchar     │     boolean     │
├─────────────────┼─────────────────┤
│ value is even   │ false           │
│ value is uneven │ true            │
└─────────────────┴─────────────────┘                

`count`

Signature

count(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the number of elements present in a given column

Parameters

column : str

The column name to perform count on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.count("id")

Result

┌───────────┐
│ count(id) │
│   int64   │
├───────────┤
│         9 │
└───────────┘

`cume_dist`

Signature

cume_dist(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the cumulative distribution within the partition

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.cume_dist(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬──────────────────────────────────────────────────────────────┐
│   description   │ value │ cume_dist() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                            double                            │
├─────────────────┼───────┼──────────────────────────────────────────────────────────────┤
│ value is uneven │     1 │                                                          0.2 │
│ value is uneven │     3 │                                                          0.4 │
│ value is uneven │     5 │                                                          0.6 │
│ value is uneven │     7 │                                                          0.8 │
│ value is uneven │     9 │                                                          1.0 │
│ value is even   │     2 │                                                         0.25 │
│ value is even   │     4 │                                                          0.5 │
│ value is even   │     6 │                                                         0.75 │
│ value is even   │     8 │                                                          1.0 │
└─────────────────┴───────┴──────────────────────────────────────────────────────────────┘

`dense_rank`

Signature

dense_rank(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the dense rank within the partition

Aliases: rank_dense

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

 rel.dense_rank(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬───────────────────────────────────────────────────────────────┐
│   description   │ value │ dense_rank() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                             int64                             │
├─────────────────┼───────┼───────────────────────────────────────────────────────────────┤
│ value is even   │     2 │                                                             1 │
│ value is even   │     4 │                                                             2 │
│ value is even   │     6 │                                                             3 │
│ value is even   │     8 │                                                             4 │
│ value is uneven │     1 │                                                             1 │
│ value is uneven │     3 │                                                             2 │
│ value is uneven │     5 │                                                             3 │
│ value is uneven │     7 │                                                             4 │
│ value is uneven │     9 │                                                             5 │
└─────────────────┴───────┴───────────────────────────────────────────────────────────────┘

`distinct`

Signature

distinct(self: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Retrieve distinct rows from this relation object

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("select range from range(1,4)")

rel = rel.union(union_rel=rel)

rel.distinct().order("range")

Result

┌───────┐
│ range │
│ int64 │
├───────┤
│     1 │
│     2 │
│     3 │
└───────┘

`favg`

Signature

favg(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the average of all values present in a given column using a more accurate floating point summation (Kahan Sum)

Parameters

column : str

The column name to calculate the average on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.favg(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────┐
│   description   │ favg("value") │
│     varchar     │    double     │
├─────────────────┼───────────────┤
│ value is uneven │           5.0 │
│ value is even   │           5.0 │
└─────────────────┴───────────────┘

`first`

Signature

first(self: _goose.GoosePyRelation, column: str, groups: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the first value of a given column

Parameters

column : str

The column name from which to retrieve the first value.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.first(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────┐
│   description   │ "first"("value") │
│     varchar     │      int64       │
├─────────────────┼──────────────────┤
│ value is even   │                2 │
│ value is uneven │                1 │
└─────────────────┴──────────────────┘

`first_value`

Signature

first_value(self: _goose.GoosePyRelation, column: str, window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the first value within the group or partition

Parameters

column : str

The column name from which to retrieve the first value.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.first_value(column="value", window_spec="over (partition by description order by value)", projected_columns="description").distinct()

Result

┌─────────────────┬───────────────────────────────────────────────────────────────────────┐
│   description   │ first_value("value") OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │                                 int64                                 │
├─────────────────┼───────────────────────────────────────────────────────────────────────┤
│ value is even   │                                                                     2 │
│ value is uneven │                                                                     1 │
└─────────────────┴───────────────────────────────────────────────────────────────────────┘

`fsum`

Signature

fsum(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sum of all values present in a given column using a more accurate floating point summation (Kahan Sum)

Parameters

column : str

The column name to calculate the sum on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.fsum(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────┐
│   description   │ fsum("value") │
│     varchar     │    double     │
├─────────────────┼───────────────┤
│ value is even   │          20.0 │
│ value is uneven │          25.0 │
└─────────────────┴───────────────┘

`geomean`

Signature

geomean(self: _goose.GoosePyRelation, column: str, groups: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the geometric mean over all values present in a given column

Parameters

column : str

The column name to calculate the geometric mean on.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.geomean(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────┐
│   description   │ geomean("value")  │
│     varchar     │      double       │
├─────────────────┼───────────────────┤
│ value is uneven │ 3.936283427035351 │
│ value is even   │ 4.426727678801287 │
└─────────────────┴───────────────────┘

`histogram`

Signature

histogram(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the histogram over all values present in a given column

Parameters

column : str

The column name to calculate the histogram on.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.histogram(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────────────┐
│   description   │    histogram("value")     │
│     varchar     │   map(bigint, ubigint)    │
├─────────────────┼───────────────────────────┤
│ value is uneven │ {1=1, 3=1, 5=1, 7=1, 9=1} │
│ value is even   │ {2=1, 4=1, 6=1, 8=1}      │
└─────────────────┴───────────────────────────┘

`lag`

Signature

lag(self: _goose.GoosePyRelation, column: str, window_spec: str, offset: typing.SupportsInt = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the lag within the partition

Parameters

column : str

The column name to apply the lag function on.
window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
offset : int, default: 1

The number of rows to lag behind.
default_value : str, default: 'NULL'

The default value to return when the lag offset goes out of bounds.
ignore_nulls : bool, default: False

Whether to ignore NULL values when computing the lag.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.lag(column="description", window_spec="over (order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬───────────────────────────────────────────────────┐
│   description   │ value │ lag(description, 1, NULL) OVER (ORDER BY "value") │
│     varchar     │ int64 │                      varchar                      │
├─────────────────┼───────┼───────────────────────────────────────────────────┤
│ value is uneven │     1 │ NULL                                              │
│ value is even   │     2 │ value is uneven                                   │
│ value is uneven │     3 │ value is even                                     │
│ value is even   │     4 │ value is uneven                                   │
│ value is uneven │     5 │ value is even                                     │
│ value is even   │     6 │ value is uneven                                   │
│ value is uneven │     7 │ value is even                                     │
│ value is even   │     8 │ value is uneven                                   │
│ value is uneven │     9 │ value is even                                     │
└─────────────────┴───────┴───────────────────────────────────────────────────┘

`last`

Signature

last(self: _goose.GoosePyRelation, column: str, groups: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the last value of a given column

Parameters

column : str

The column name from which to retrieve the last value.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.last(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────┐
│   description   │ "last"("value") │
│     varchar     │      int64      │
├─────────────────┼─────────────────┤
│ value is even   │               8 │
│ value is uneven │               9 │
└─────────────────┴─────────────────┘

`last_value`

Signature

last_value(self: _goose.GoosePyRelation, column: str, window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the last value within the group or partition

Parameters

column : str

The column name from which to retrieve the last value within the window.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.last_value(column="value", window_spec="over (order by description)", projected_columns="description").distinct()

Result

┌─────────────────┬─────────────────────────────────────────────────┐
│   description   │ last_value("value") OVER (ORDER BY description) │
│     varchar     │                      int64                      │
├─────────────────┼─────────────────────────────────────────────────┤
│ value is uneven │                                               9 │
│ value is even   │                                               8 │
└─────────────────┴─────────────────────────────────────────────────┘

`lead`

Signature

lead(self: _goose.GoosePyRelation, column: str, window_spec: str, offset: typing.SupportsInt = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the lead within the partition

Parameters

column : str

The column name to apply the lead function on.
window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
offset : int, default: 1

The number of rows to lead ahead.
default_value : str, default: 'NULL'

The default value to return when the lead offset goes out of bounds.
ignore_nulls : bool, default: False

Whether to ignore NULL values when computing the lead.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.lead(column="description", window_spec="over (order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬────────────────────────────────────────────────────┐
│   description   │ value │ lead(description, 1, NULL) OVER (ORDER BY "value") │
│     varchar     │ int64 │                      varchar                       │
├─────────────────┼───────┼────────────────────────────────────────────────────┤
│ value is uneven │     1 │ value is even                                      │
│ value is even   │     2 │ value is uneven                                    │
│ value is uneven │     3 │ value is even                                      │
│ value is even   │     4 │ value is uneven                                    │
│ value is uneven │     5 │ value is even                                      │
│ value is even   │     6 │ value is uneven                                    │
│ value is uneven │     7 │ value is even                                      │
│ value is even   │     8 │ value is uneven                                    │
│ value is uneven │     9 │ NULL                                               │
└─────────────────┴───────┴────────────────────────────────────────────────────┘

`list`

Signature

list(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns a list containing all values present in a given column

Parameters

column : str

The column name to aggregate values into a list.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.list(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────┐
│   description   │  list("value")  │
│     varchar     │     int64[]     │
├─────────────────┼─────────────────┤
│ value is even   │ [2, 4, 6, 8]    │
│ value is uneven │ [1, 3, 5, 7, 9] │
└─────────────────┴─────────────────┘

`max`

Signature

max(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the maximum value present in a given column

Parameters

column : str

The column name to calculate the maximum value of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

 rel.max(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────┐
│   description   │ max("value") │
│     varchar     │    int64     │
├─────────────────┼──────────────┤
│ value is even   │            8 │
│ value is uneven │            9 │
└─────────────────┴──────────────┘

`mean`

Signature

mean(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the average on a given column

Parameters

column : str

The column name to calculate the mean value of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.mean(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────┐
│   description   │ avg("value") │
│     varchar     │    double    │
├─────────────────┼──────────────┤
│ value is even   │          5.0 │
│ value is uneven │          5.0 │
└─────────────────┴──────────────┘

`median`

Signature

median(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the median over all values present in a given column

Parameters

column : str

The column name to calculate the median value of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.median(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────┐
│   description   │ median("value") │
│     varchar     │     double      │
├─────────────────┼─────────────────┤
│ value is even   │             5.0 │
│ value is uneven │             5.0 │
└─────────────────┴─────────────────┘

`min`

Signature

min(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the minimum value present in a given column

Parameters

column : str

The column name to calculate the min value of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.min(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────┐
│   description   │ min("value") │
│     varchar     │    int64     │
├─────────────────┼──────────────┤
│ value is uneven │            1 │
│ value is even   │            2 │
└─────────────────┴──────────────┘

`mode`

Signature

mode(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the mode over all values present in a given column

Parameters

column : str

The column name to calculate the mode (most frequent value) of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.mode(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────┐
│   description   │ "mode"("value") │
│     varchar     │      int64      │
├─────────────────┼─────────────────┤
│ value is uneven │               1 │
│ value is even   │               2 │
└─────────────────┴─────────────────┘

`n_tile`

Signature

n_tile(self: _goose.GoosePyRelation, window_spec: str, num_buckets: typing.SupportsInt, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Divides the partition as equally as possible into num_buckets

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
num_buckets : int

The number of buckets to divide the rows into.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.n_tile(window_spec="over (partition by description)", num_buckets=2, projected_columns="description, value")

Result

┌─────────────────┬───────┬──────────────────────────────────────────┐
│   description   │ value │ ntile(2) OVER (PARTITION BY description) │
│     varchar     │ int64 │                  int64                   │
├─────────────────┼───────┼──────────────────────────────────────────┤
│ value is uneven │     1 │                                        1 │
│ value is uneven │     3 │                                        1 │
│ value is uneven │     5 │                                        1 │
│ value is uneven │     7 │                                        2 │
│ value is uneven │     9 │                                        2 │
│ value is even   │     2 │                                        1 │
│ value is even   │     4 │                                        1 │
│ value is even   │     6 │                                        2 │
│ value is even   │     8 │                                        2 │
└─────────────────┴───────┴──────────────────────────────────────────┘

`nth_value`

Signature

nth_value(self: _goose.GoosePyRelation, column: str, window_spec: str, offset: typing.SupportsInt, ignore_nulls: bool = False, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the nth value within the partition

Parameters

column : str

The column name from which to retrieve the nth value within the window.
window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
offset : int

The position of the value to retrieve within the window (1-based index).
ignore_nulls : bool, default: False

Whether to ignore NULL values when computing the nth value.
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.nth_value(column="value", window_spec="over (partition by description)", projected_columns="description", offset=1)

Result

┌─────────────────┬───────────────────────────────────────────────────────┐
│   description   │ nth_value("value", 1) OVER (PARTITION BY description) │
│     varchar     │                         int64                         │
├─────────────────┼───────────────────────────────────────────────────────┤
│ value is even   │                                                     2 │
│ value is even   │                                                     2 │
│ value is even   │                                                     2 │
│ value is even   │                                                     2 │
│ value is uneven │                                                     1 │
│ value is uneven │                                                     1 │
│ value is uneven │                                                     1 │
│ value is uneven │                                                     1 │
│ value is uneven │                                                     1 │
└─────────────────┴───────────────────────────────────────────────────────┘

`percent_rank`

Signature

percent_rank(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the relative rank within the partition

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.percent_rank(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬─────────────────────────────────────────────────────────────────┐
│   description   │ value │ percent_rank() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                             double                              │
├─────────────────┼───────┼─────────────────────────────────────────────────────────────────┤
│ value is even   │     2 │                                                             0.0 │
│ value is even   │     4 │                                              0.3333333333333333 │
│ value is even   │     6 │                                              0.6666666666666666 │
│ value is even   │     8 │                                                             1.0 │
│ value is uneven │     1 │                                                             0.0 │
│ value is uneven │     3 │                                                            0.25 │
│ value is uneven │     5 │                                                             0.5 │
│ value is uneven │     7 │                                                            0.75 │
│ value is uneven │     9 │                                                             1.0 │
└─────────────────┴───────┴─────────────────────────────────────────────────────────────────┘

`product`

Signature

product(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Returns the product of all values present in a given column

Parameters

column : str

The column name to calculate the product of.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.product(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────┐
│   description   │ product("value") │
│     varchar     │      double      │
├─────────────────┼──────────────────┤
│ value is uneven │            945.0 │
│ value is even   │            384.0 │
└─────────────────┴──────────────────┘

`quantile`

Signature

quantile(self: _goose.GoosePyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the exact quantile value for a given column

Parameters

column : str

The column name to compute the quantile for.
q : object, default: 0.5

The quantile value to compute (e.g., 0.5 for median).
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.quantile(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────┐
│   description   │ quantile_disc("value", 0.500000) │
│     varchar     │              int64               │
├─────────────────┼──────────────────────────────────┤
│ value is uneven │                                5 │
│ value is even   │                                4 │
└─────────────────┴──────────────────────────────────┘

`quantile_cont`

Signature

quantile_cont(self: _goose.GoosePyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the interpolated quantile value for a given column

Parameters

column : str

The column name to compute the continuous quantile for.
q : object, default: 0.5

The quantile value to compute (e.g., 0.5 for median).
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.quantile_cont(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────┐
│   description   │ quantile_cont("value", 0.500000) │
│     varchar     │              double              │
├─────────────────┼──────────────────────────────────┤
│ value is even   │                              5.0 │
│ value is uneven │                              5.0 │
└─────────────────┴──────────────────────────────────┘

`quantile_disc`

Signature

quantile_disc(self: _goose.GoosePyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the exact quantile value for a given column

Parameters

column : str

The column name to compute the discrete quantile for.
q : object, default: 0.5

The quantile value to compute (e.g., 0.5 for median).
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.quantile_disc(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────────────┐
│   description   │ quantile_disc("value", 0.500000) │
│     varchar     │              int64               │
├─────────────────┼──────────────────────────────────┤
│ value is even   │                                4 │
│ value is uneven │                                5 │
└─────────────────┴──────────────────────────────────┘

`rank`

Signature

rank(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the rank within the partition

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.rank(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬─────────────────────────────────────────────────────────┐
│   description   │ value │ rank() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                          int64                          │
├─────────────────┼───────┼─────────────────────────────────────────────────────────┤
│ value is uneven │     1 │                                                       1 │
│ value is uneven │     3 │                                                       2 │
│ value is uneven │     5 │                                                       3 │
│ value is uneven │     7 │                                                       4 │
│ value is uneven │     9 │                                                       5 │
│ value is even   │     2 │                                                       1 │
│ value is even   │     4 │                                                       2 │
│ value is even   │     6 │                                                       3 │
│ value is even   │     8 │                                                       4 │
└─────────────────┴───────┴─────────────────────────────────────────────────────────┘

`rank_dense`

Signature

rank_dense(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the dense rank within the partition

Aliases: dense_rank

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

 rel.rank_dense(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬───────────────────────────────────────────────────────────────┐
│   description   │ value │ dense_rank() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                             int64                             │
├─────────────────┼───────┼───────────────────────────────────────────────────────────────┤
│ value is uneven │     1 │                                                             1 │
│ value is uneven │     3 │                                                             2 │
│ value is uneven │     5 │                                                             3 │
│ value is uneven │     7 │                                                             4 │
│ value is uneven │     9 │                                                             5 │
│ value is even   │     2 │                                                             1 │
│ value is even   │     4 │                                                             2 │
│ value is even   │     6 │                                                             3 │
│ value is even   │     8 │                                                             4 │
└─────────────────┴───────┴───────────────────────────────────────────────────────────────┘

`row_number`

Signature

row_number(self: _goose.GoosePyRelation, window_spec: str, projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the row number within the partition

Parameters

window_spec : str

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.row_number(window_spec="over (partition by description order by value)", projected_columns="description, value")

Result

┌─────────────────┬───────┬───────────────────────────────────────────────────────────────┐
│   description   │ value │ row_number() OVER (PARTITION BY description ORDER BY "value") │
│     varchar     │ int64 │                             int64                             │
├─────────────────┼───────┼───────────────────────────────────────────────────────────────┤
│ value is uneven │     1 │                                                             1 │
│ value is uneven │     3 │                                                             2 │
│ value is uneven │     5 │                                                             3 │
│ value is uneven │     7 │                                                             4 │
│ value is uneven │     9 │                                                             5 │
│ value is even   │     2 │                                                             1 │
│ value is even   │     4 │                                                             2 │
│ value is even   │     6 │                                                             3 │
│ value is even   │     8 │                                                             4 │
└─────────────────┴───────┴───────────────────────────────────────────────────────────────┘

`select_dtypes`

Signature

select_dtypes(self: _goose.GoosePyRelation, types: object) -> _goose.GoosePyRelation

Description

Select columns from the relation, by filtering based on type(s)

Aliases: select_types

Parameters

types : object

Data type(s) to select columns by. Can be a single type or a collection of types.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.select_dtypes(types=[goose.sqltypes.VARCHAR]).distinct()

Result

┌─────────────────┐
│   description   │
│     varchar     │
├─────────────────┤
│ value is even   │
│ value is uneven │
└─────────────────┘

`select_types`

Signature

select_types(self: _goose.GoosePyRelation, types: object) -> _goose.GoosePyRelation

Description

Select columns from the relation, by filtering based on type(s)

Aliases: select_dtypes

Parameters

types : object

Data type(s) to select columns by. Can be a single type or a collection of types.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.select_types(types=[goose.sqltypes.VARCHAR]).distinct()

Result

┌─────────────────┐
│   description   │
│     varchar     │
├─────────────────┤
│ value is even   │
│ value is uneven │
└─────────────────┘

`std`

Signature

std(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample standard deviation for a given column

Aliases: stddev, stddev_samp

Parameters

column : str

The column name to calculate the standard deviation for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.std(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────┐
│   description   │ stddev_samp("value") │
│     varchar     │        double        │
├─────────────────┼──────────────────────┤
│ value is uneven │   3.1622776601683795 │
│ value is even   │    2.581988897471611 │
└─────────────────┴──────────────────────┘

`stddev`

Signature

stddev(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample standard deviation for a given column

Aliases: std, stddev_samp

Parameters

column : str

The column name to calculate the standard deviation for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.stddev(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────┐
│   description   │ stddev_samp("value") │
│     varchar     │        double        │
├─────────────────┼──────────────────────┤
│ value is even   │    2.581988897471611 │
│ value is uneven │   3.1622776601683795 │
└─────────────────┴──────────────────────┘

`stddev_pop`

Signature

stddev_pop(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the population standard deviation for a given column

Parameters

column : str

The column name to calculate the standard deviation for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.stddev_pop(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬─────────────────────┐
│   description   │ stddev_pop("value") │
│     varchar     │       double        │
├─────────────────┼─────────────────────┤
│ value is even   │    2.23606797749979 │
│ value is uneven │  2.8284271247461903 │
└─────────────────┴─────────────────────┘

`stddev_samp`

Signature

stddev_samp(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample standard deviation for a given column

Aliases: stddev, std

Parameters

column : str

The column name to calculate the standard deviation for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.stddev_samp(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────┐
│   description   │ stddev_samp("value") │
│     varchar     │        double        │
├─────────────────┼──────────────────────┤
│ value is even   │    2.581988897471611 │
│ value is uneven │   3.1622776601683795 │
└─────────────────┴──────────────────────┘

`string_agg`

Signature

string_agg(self: _goose.GoosePyRelation, column: str, sep: str = ',', groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Concatenates the values present in a given column with a separator

Parameters

column : str

The column name to concatenate values from.
sep : str, default: ','

Separator string to use between concatenated values.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.string_agg(column="value", sep=",", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────────────┐
│   description   │ string_agg("value", ',') │
│     varchar     │         varchar          │
├─────────────────┼──────────────────────────┤
│ value is even   │ 2,4,6,8                  │
│ value is uneven │ 1,3,5,7,9                │
└─────────────────┴──────────────────────────┘

`sum`

Signature

sum(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sum of all values present in a given column

Parameters

column : str

The column name to calculate the sum for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.sum(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────┐
│   description   │ sum("value") │
│     varchar     │    int128    │
├─────────────────┼──────────────┤
│ value is even   │           20 │
│ value is uneven │           25 │
└─────────────────┴──────────────┘

`unique`

Signature

unique(self: _goose.GoosePyRelation, unique_aggr: str) -> _goose.GoosePyRelation

Description

Returns the distinct values in a column.

Parameters

unique_aggr : str

The column to get the distinct values for.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.unique(unique_aggr="description")

Result

┌─────────────────┐
│   description   │
│     varchar     │
├─────────────────┤
│ value is even   │
│ value is uneven │
└─────────────────┘

`value_counts`

Signature

value_counts(self: _goose.GoosePyRelation, column: str, groups: str = '') -> _goose.GoosePyRelation

Description

Computes the number of elements present in a given column, also projecting the original column

Parameters

column : str

The column name to count values from.
groups : str, default: ''

Comma-separated list of columns to include in the group by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.value_counts(column="description", groups="description")

Result

┌─────────────────┬────────────────────┐
│   description   │ count(description) │
│     varchar     │       int64        │
├─────────────────┼────────────────────┤
│ value is uneven │                  5 │
│ value is even   │                  4 │
└─────────────────┴────────────────────┘

`var`

Signature

var(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample variance for a given column

Aliases: variance, var_samp

Parameters

column : str

The column name to calculate the sample variance for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.var(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────┐
│   description   │ var_samp("value") │
│     varchar     │      double       │
├─────────────────┼───────────────────┤
│ value is even   │ 6.666666666666667 │
│ value is uneven │              10.0 │
└─────────────────┴───────────────────┘

`var_pop`

Signature

var_pop(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the population variance for a given column

Parameters

column : str

The column name to calculate the population variance for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.var_pop(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬──────────────────┐
│   description   │ var_pop("value") │
│     varchar     │      double      │
├─────────────────┼──────────────────┤
│ value is even   │              5.0 │
│ value is uneven │              8.0 │
└─────────────────┴──────────────────┘

`var_samp`

Signature

var_samp(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample variance for a given column

Aliases: variance, var

Parameters

column : str

The column name to calculate the sample variance for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.var_samp(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────┐
│   description   │ var_samp("value") │
│     varchar     │      double       │
├─────────────────┼───────────────────┤
│ value is even   │ 6.666666666666667 │
│ value is uneven │              10.0 │
└─────────────────┴───────────────────┘

`variance`

Signature

variance(self: _goose.GoosePyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') -> _goose.GoosePyRelation

Description

Computes the sample variance for a given column

Aliases: var, var_samp

Parameters

column : str

The column name to calculate the sample variance for.
groups : str, default: ''

Comma-separated list of columns to include in the group by.
window_spec : str, default: ''

Optional window specification for window functions, provided as over (partition by ... order by ...)
projected_columns : str, default: ''

Comma-separated list of columns to include in the result.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.variance(column="value", groups="description", projected_columns="description")

Result

┌─────────────────┬───────────────────┐
│   description   │ var_samp("value") │
│     varchar     │      double       │
├─────────────────┼───────────────────┤
│ value is even   │ 6.666666666666667 │
│ value is uneven │              10.0 │
└─────────────────┴───────────────────┘

Output

This section contains the functions which will trigger an SQL execution and retrieve the data.

Name	Description
`arrow`	Execute and return an Arrow Record Batch Reader that yields all rows
`close`	Closes the result
`create`	Creates a new table named table_name with the contents of the relation object
`create_view`	Creates a view named view_name that refers to the relation object
`df`	Execute and fetch all rows as a pandas DataFrame
`execute`	Transform the relation into a result set
`fetch_arrow_reader`	Execute and return an Arrow Record Batch Reader that yields all rows
`fetch_arrow_table`	Execute and fetch all rows as an Arrow Table
`fetch_df_chunk`	Execute and fetch a chunk of the rows
`fetch_record_batch`	Execute and return an Arrow Record Batch Reader that yields all rows
`fetchall`	Execute and fetch all rows as a list of tuples
`fetchdf`	Execute and fetch all rows as a pandas DataFrame
`fetchmany`	Execute and fetch the next set of rows as a list of tuples
`fetchnumpy`	Execute and fetch all rows as a Python dict mapping each column to one numpy arrays
`fetchone`	Execute and fetch a single row as a tuple
`pl`	Execute and fetch all rows as a Polars DataFrame
`record_batch`	record_batch(self: object, batch_size: typing.SupportsInt = 1000000) -> object
`tf`	Fetch a result as dict of TensorFlow Tensors
`to_arrow_table`	Execute and fetch all rows as an Arrow Table
`to_csv`	Write the relation object to a CSV file in 'file_name'
`to_df`	Execute and fetch all rows as a pandas DataFrame
`to_parquet`	Write the relation object to a Parquet file in 'file_name'
`to_table`	Creates a new table named table_name with the contents of the relation object
`to_view`	Creates a view named view_name that refers to the relation object
`torch`	Fetch a result as dict of PyTorch Tensors
`write_csv`	Write the relation object to a CSV file in 'file_name'
`write_parquet`	Write the relation object to a Parquet file in 'file_name'

`arrow`

Signature

arrow(self: _goose.GoosePyRelation, batch_size: typing.SupportsInt = 1000000) -> pyarrow.lib.RecordBatchReader

Description

Execute and return an Arrow Record Batch Reader that yields all rows

Aliases: fetch_arrow_table, to_arrow_table

Parameters

batch_size : int, default: 1000000

The batch size of writing the data to the Arrow table

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

pa_table = rel.arrow()

pa_table

Result

pyarrow.Table
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: [["3ac9e0ba-8390-4a02-ad72-33b1caea6354","8b844392-1404-4bbc-b731-120f42c8ca27","ca5584ca-8e97-4fca-a295-ae3c16c32f5b","926d071e-5f64-488f-ae02-d19e315f9f5c","aabeedf0-5783-4eff-9963-b3967a6ea5d8","1f20db9a-bee8-4b65-b7e8-e7c36b5b8fee","795c678e-3524-4b52-96ec-7b48c24eeab1","9ffbd403-169f-4fe4-bc41-09751066f1f1","8fdb0a60-29f0-4f5b-afcc-c736a03cd083"]]
description: [["value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven"]]
value: [[1,2,3,4,5,6,7,8,9]]
created_timestamp: [[2025-04-10 09:07:12.614000Z,2025-04-10 09:08:12.614000Z,2025-04-10 09:09:12.614000Z,2025-04-10 09:10:12.614000Z,2025-04-10 09:11:12.614000Z,2025-04-10 09:12:12.614000Z,2025-04-10 09:13:12.614000Z,2025-04-10 09:14:12.614000Z,2025-04-10 09:15:12.614000Z]]

`close`

Signature

close(self: _goose.GoosePyRelation) -> None

Description

Closes the result

`create`

Signature

create(self: _goose.GoosePyRelation, table_name: str) -> None

Description

Creates a new table named table_name with the contents of the relation object

Aliases: to_table

Parameters

table_name : str

The name of the table to be created. There shouldn't be any other table with the same name.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.create("table_code_example")

goose_conn.table("table_code_example").limit(1)

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 3ac9e0ba-8390-4a02-ad72-33b1caea6354 │ value is uneven │     1 │ 2025-04-10 11:07:12.614+02 │
└──────────────────────────────────────┴─────────────────┴───────┴────────────────────────────┘

`create_view`

Signature

create_view(self: _goose.GoosePyRelation, view_name: str, replace: bool = True) -> _goose.GoosePyRelation

Description

Creates a view named view_name that refers to the relation object

Aliases: to_view

Parameters

view_name : str

The name of the view to be created.
replace : bool, default: True

If the view should be created with CREATE OR REPLACE. When set to False, there shouldn't be another view with the same view_name.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.create_view("view_code_example", replace=True)

goose_conn.table("view_code_example").limit(1)

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 3ac9e0ba-8390-4a02-ad72-33b1caea6354 │ value is uneven │     1 │ 2025-04-10 11:07:12.614+02 │
└──────────────────────────────────────┴─────────────────┴───────┴────────────────────────────┘

`df`

Signature

df(self: _goose.GoosePyRelation, *, date_as_object: bool = False) -> pandas.DataFrame

Description

Execute and fetch all rows as a pandas DataFrame

Aliases: fetchdf, to_df

Parameters

date_as_object : bool, default: False

If the date columns should be interpreted as Python date objects.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.df()

Result

                                     id      description  value                created_timestamp
0  3ac9e0ba-8390-4a02-ad72-33b1caea6354  value is uneven      1 2025-04-10 11:07:12.614000+02:00
1  8b844392-1404-4bbc-b731-120f42c8ca27    value is even      2 2025-04-10 11:08:12.614000+02:00
2  ca5584ca-8e97-4fca-a295-ae3c16c32f5b  value is uneven      3 2025-04-10 11:09:12.614000+02:00
...

`execute`

Signature

execute(self: _goose.GoosePyRelation) -> _goose.GoosePyRelation

Description

Transform the relation into a result set

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.execute()

Result

┌──────────────────────────────────────┬─────────────────┬───────┬────────────────────────────┐
│                  id                  │   description   │ value │     created_timestamp      │
│                 uuid                 │     varchar     │ int64 │  timestamp with time zone  │
├──────────────────────────────────────┼─────────────────┼───────┼────────────────────────────┤
│ 3ac9e0ba-8390-4a02-ad72-33b1caea6354 │ value is uneven │     1 │ 2025-04-10 11:07:12.614+02 │
│ 8b844392-1404-4bbc-b731-120f42c8ca27 │ value is even   │     2 │ 2025-04-10 11:08:12.614+02 │
│ ca5584ca-8e97-4fca-a295-ae3c16c32f5b │ value is uneven │     3 │ 2025-04-10 11:09:12.614+02 │

`fetch_arrow_reader`

Signature

fetch_arrow_reader(self: _goose.GoosePyRelation, batch_size: typing.SupportsInt = 1000000) -> pyarrow.lib.RecordBatchReader

Description

Execute and return an Arrow Record Batch Reader that yields all rows

Parameters

batch_size : int, default: 1000000

The batch size for fetching the data.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

pa_reader = rel.fetch_arrow_reader(batch_size=1)

pa_reader.read_next_batch()

Result

pyarrow.RecordBatch
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: ["e4ab8cb4-4609-40cb-ad7e-4304ed5ed4bd"]
description: ["value is even"]
value: [2]
created_timestamp: [2025-04-10 09:25:51.259000Z]

`fetch_arrow_table`

Signature

fetch_arrow_table(self: _goose.GoosePyRelation, batch_size: typing.SupportsInt = 1000000) -> pyarrow.lib.Table

Description

Execute and fetch all rows as an Arrow Table

Aliases: arrow, to_arrow_table

Parameters

batch_size : int, default: 1000000

The batch size for fetching the data.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.fetch_arrow_table()

Result

pyarrow.Table
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: [["1587b4b0-3023-49fe-82cf-06303ca136ac","e4ab8cb4-4609-40cb-ad7e-4304ed5ed4bd","3f8ad67a-290f-4a22-b41b-0173b8e45afa","9a4e37ef-d8bd-46dd-ab01-51cf4973549f","12baa624-ebc9-45ae-b73e-6f4029e31d2d","56d41292-53cc-48be-a1b8-e1f5d6ca5581","1accca18-c950-47c1-9108-aef8afbd5249","56d8db75-72c4-4d40-90d2-a3c840579c37","e19f6201-8646-401c-b019-e37c42c39632"]]
description: [["value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven"]]
value: [[1,2,3,4,5,6,7,8,9]]
created_timestamp: [[2025-04-10 09:24:51.259000Z,2025-04-10 09:25:51.259000Z,2025-04-10 09:26:51.259000Z,2025-04-10 09:27:51.259000Z,2025-04-10 09:28:51.259000Z,2025-04-10 09:29:51.259000Z,2025-04-10 09:30:51.259000Z,2025-04-10 09:31:51.259000Z,2025-04-10 09:32:51.259000Z]]

`fetch_df_chunk`

Signature

fetch_df_chunk(self: _goose.GoosePyRelation, vectors_per_chunk: typing.SupportsInt = 1, *, date_as_object: bool = False) -> pandas.DataFrame

Description

Execute and fetch a chunk of the rows

Parameters

vectors_per_chunk : int, default: 1

Number of data chunks to be processed before converting to dataframe.
date_as_object : bool, default: False

If the date columns should be interpreted as Python date objects.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.fetch_df_chunk()

Result

                                     id      description  value                created_timestamp
0  1587b4b0-3023-49fe-82cf-06303ca136ac  value is uneven      1 2025-04-10 11:24:51.259000+02:00
1  e4ab8cb4-4609-40cb-ad7e-4304ed5ed4bd    value is even      2 2025-04-10 11:25:51.259000+02:00
2  3f8ad67a-290f-4a22-b41b-0173b8e45afa  value is uneven      3 2025-04-10 11:26:51.259000+02:00
...

`fetch_record_batch`

Signature

fetch_record_batch(self: _goose.GoosePyRelation, rows_per_batch: typing.SupportsInt = 1000000) -> pyarrow.lib.RecordBatchReader

Description

Execute and return an Arrow Record Batch Reader that yields all rows

Deprecated fetch_record_batch() is deprecated since 1.4.0. Use record_batch() instead.

Aliases: record_batch

Parameters

rows_per_batch : int, default: 1000000

The number of rows per batch.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

pa_reader = rel.fetch_record_batch(rows_per_batch=1)

pa_reader.read_next_batch()

Result

pyarrow.RecordBatch
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: ["908cf67c-a086-4b94-9017-2089a83e4a6c"]
description: ["value is uneven"]
value: [1]
created_timestamp: [2025-04-10 09:52:55.249000Z]

`fetchall`

Signature

fetchall(self: _goose.GoosePyRelation) -> list

Description

Execute and fetch all rows as a list of tuples

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.limit(1).fetchall()

Result

[(UUID('1587b4b0-3023-49fe-82cf-06303ca136ac'),
  'value is uneven',
  1,
  datetime.datetime(2025, 4, 10, 11, 24, 51, 259000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]

`fetchdf`

Signature

fetchdf(self: _goose.GoosePyRelation, *, date_as_object: bool = False) -> pandas.DataFrame

Description

Execute and fetch all rows as a pandas DataFrame

Aliases: df, to_df

Parameters

date_as_object : bool, default: False

If the date columns should be interpreted as Python date objects.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.fetchdf()

Result

                                     id      description  value                created_timestamp
0  1587b4b0-3023-49fe-82cf-06303ca136ac  value is uneven      1 2025-04-10 11:24:51.259000+02:00
1  e4ab8cb4-4609-40cb-ad7e-4304ed5ed4bd    value is even      2 2025-04-10 11:25:51.259000+02:00
2  3f8ad67a-290f-4a22-b41b-0173b8e45afa  value is uneven      3 2025-04-10 11:26:51.259000+02:00
...

`fetchmany`

Signature

fetchmany(self: _goose.GoosePyRelation, size: typing.SupportsInt = 1) -> list

Description

Execute and fetch the next set of rows as a list of tuples

Warning Executing any operation during the retrieval of the data from an aggregate relation, will close the result set.

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
      select 
          gen_random_uuid() as id, 
          concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
          range as value, 
          now() + concat(range,' ', 'minutes')::interval as created_timestamp
      from range(1, 10)
   """
)

agg_rel = rel.aggregate("value")

while res := agg_rel.fetchmany(size=1):
   print(res)
   rel.show()

Parameters

size : int, default: 1

The number of records to be fetched.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

while res := rel.fetchmany(size=1):
    print(res)

Result

[(UUID('cf4c5e32-d0aa-4699-a3ee-0092e900f263'), 'value is uneven', 1, datetime.datetime(2025, 4, 30, 16, 23, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('cec335ac-24ac-49a3-ae9a-bb35f71fc88d'), 'value is even', 2, datetime.datetime(2025, 4, 30, 16, 24, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('2423295d-9bb0-453c-a385-21bdacba03b6'), 'value is uneven', 3, datetime.datetime(2025, 4, 30, 16, 25, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('88806b21-192d-41e7-a293-c789aad636ba'), 'value is even', 4, datetime.datetime(2025, 4, 30, 16, 26, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('05837a28-dacf-4121-88a6-a374aefb8a07'), 'value is uneven', 5, datetime.datetime(2025, 4, 30, 16, 27, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('b9c1f7e9-6156-4554-b80e-67d3b5d810bb'), 'value is even', 6, datetime.datetime(2025, 4, 30, 16, 28, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('4709c7fa-d286-4864-bb48-69748b447157'), 'value is uneven', 7, datetime.datetime(2025, 4, 30, 16, 29, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('30e48457-b103-4fa5-95cf-1c7f0143335b'), 'value is even', 8, datetime.datetime(2025, 4, 30, 16, 30, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]
[(UUID('036b7f4b-bd78-4ffb-a351-964d93f267b7'), 'value is uneven', 9, datetime.datetime(2025, 4, 30, 16, 31, 5, 310000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))]

`fetchnumpy`

Signature

fetchnumpy(self: _goose.GoosePyRelation) -> dict

Description

Execute and fetch all rows as a Python dict mapping each column to one numpy arrays

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.fetchnumpy()

Result

{'id': array([UUID('1587b4b0-3023-49fe-82cf-06303ca136ac'),
        UUID('e4ab8cb4-4609-40cb-ad7e-4304ed5ed4bd'),
        UUID('3f8ad67a-290f-4a22-b41b-0173b8e45afa'),
        UUID('9a4e37ef-d8bd-46dd-ab01-51cf4973549f'),
        UUID('12baa624-ebc9-45ae-b73e-6f4029e31d2d'),
        UUID('56d41292-53cc-48be-a1b8-e1f5d6ca5581'),
        UUID('1accca18-c950-47c1-9108-aef8afbd5249'),
        UUID('56d8db75-72c4-4d40-90d2-a3c840579c37'),
        UUID('e19f6201-8646-401c-b019-e37c42c39632')], dtype=object),
 'description': array(['value is uneven', 'value is even', 'value is uneven',
        'value is even', 'value is uneven', 'value is even',
        'value is uneven', 'value is even', 'value is uneven'],
       dtype=object),
 'value': array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
 'created_timestamp': array(['2025-04-10T09:24:51.259000', '2025-04-10T09:25:51.259000',
        '2025-04-10T09:26:51.259000', '2025-04-10T09:27:51.259000',
        '2025-04-10T09:28:51.259000', '2025-04-10T09:29:51.259000',
        '2025-04-10T09:30:51.259000', '2025-04-10T09:31:51.259000',
        '2025-04-10T09:32:51.259000'], dtype='datetime64[us]')}

`fetchone`

Signature

fetchone(self: _goose.GoosePyRelation) -> typing.Optional[tuple]

Description

Execute and fetch a single row as a tuple

Warning Executing any operation during the retrieval of the data from an aggregate relation, will close the result set.

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
      select 
          gen_random_uuid() as id, 
          concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
          range as value, 
          now() + concat(range,' ', 'minutes')::interval as created_timestamp
      from range(1, 10)
   """
)

agg_rel = rel.aggregate("value")

while res := agg_rel.fetchone():
   print(res)
   rel.show()

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

while res := rel.fetchone():
    print(res)

Result

(UUID('fe036411-f4c7-4f52-9ddd-80cd2bb56613'), 'value is uneven', 1, datetime.datetime(2025, 4, 30, 12, 59, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('466c9b43-e9f0-4237-8f26-155f259a5b59'), 'value is even', 2, datetime.datetime(2025, 4, 30, 13, 0, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('5755cf16-a94f-41ef-a16d-21e856d71f9f'), 'value is uneven', 3, datetime.datetime(2025, 4, 30, 13, 1, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('05b52c93-bd68-45e1-b02a-a08d682c33d5'), 'value is even', 4, datetime.datetime(2025, 4, 30, 13, 2, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('cf61ef13-2840-4541-900d-f493767d7622'), 'value is uneven', 5, datetime.datetime(2025, 4, 30, 13, 3, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('033e7c68-e800-4ee8-9787-6cf50aabc27b'), 'value is even', 6, datetime.datetime(2025, 4, 30, 13, 4, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('8b8d6545-ff54-45d6-b69a-97edb63dfe43'), 'value is uneven', 7, datetime.datetime(2025, 4, 30, 13, 5, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('7da79dfe-b29c-462b-a414-9d5e3cc80139'), 'value is even', 8, datetime.datetime(2025, 4, 30, 13, 6, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))
(UUID('f83ffff2-33b9-4f86-9d14-46974b546bab'), 'value is uneven', 9, datetime.datetime(2025, 4, 30, 13, 7, 8, 912000, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>))

`pl`

Signature

pl(self: _goose.GoosePyRelation, batch_size: typing.SupportsInt = 1000000, *, lazy: bool = False) -> goose::PolarsDataFrame

Description

Execute and fetch all rows as a Polars DataFrame

Parameters

batch_size : int, default: 1000000

The number of records to be fetched per batch.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.pl(batch_size=1)

Result

shape: (9, 4)
┌─────────────────────────────────┬─────────────────┬───────┬────────────────────────────────┐
│ id                              ┆ description     ┆ value ┆ created_timestamp              │
│ ---                             ┆ ---             ┆ ---   ┆ ---                            │
│ str                             ┆ str             ┆ i64   ┆ datetime[μs, Europe/Amsterdam] │
╞═════════════════════════════════╪═════════════════╪═══════╪════════════════════════════════╡
│ b2f92c3c-9372-49f3-897f-2c86fc… ┆ value is uneven ┆ 1     ┆ 2025-04-10 11:49:51.886 CEST   │

`record_batch`

Description

record_batch(self: object, batch_size: typing.SupportsInt = 1000000) -> object

Parameters

batch_size : int, default: 1000000

The batch size for fetching the data.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

pa_batch = rel.record_batch(batch_size=1)

pa_batch.read_next_batch()

Result

pyarrow.RecordBatch
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: ["908cf67c-a086-4b94-9017-2089a83e4a6c"]
description: ["value is uneven"]
value: [1]
created_timestamp: [2025-04-10 09:52:55.249000Z]

`tf`

Signature

tf(self: _goose.GoosePyRelation) -> dict

Description

Fetch a result as dict of TensorFlow Tensors

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.select("description, value").tf()

Result

{'description': <tf.Tensor: shape=(9,), dtype=string, numpy=
 array([b'value is uneven', b'value is even', b'value is uneven',
        b'value is even', b'value is uneven', b'value is even',
        b'value is uneven', b'value is even', b'value is uneven'],
       dtype=object)>,
 'value': <tf.Tensor: shape=(9,), dtype=int64, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9])>}

`to_arrow_table`

Signature

to_arrow_table(self: _goose.GoosePyRelation, batch_size: typing.SupportsInt = 1000000) -> pyarrow.lib.Table

Description

Execute and fetch all rows as an Arrow Table

Aliases: fetch_arrow_table, arrow

Parameters

batch_size : int, default: 1000000

The batch size for fetching the data.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_arrow_table()

Result

pyarrow.Table
id: string
description: string
value: int64
created_timestamp: timestamp[us, tz=Europe/Amsterdam]
----
id: [["86b2011d-3818-426f-a41e-7cd5c7321f79","07fa4f89-0bba-4049-9acd-c933332a66d5","f2f1479e-f582-4fe4-b82f-9b753b69634c","529d3c63-5961-4adb-b0a8-8249188fc82a","aa9eea7d-7fac-4dcf-8f32-4a0b5d64f864","4852aa32-03f2-40d3-8006-b8213904775a","c0127203-f2e3-4925-9810-655bc02a3c19","2a1356ba-5707-44d6-a492-abd0a67e5efb","800a1c24-231c-4dae-bd68-627654c8a110"]]
description: [["value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven","value is even","value is uneven"]]
value: [[1,2,3,4,5,6,7,8,9]]
created_timestamp: [[2025-04-10 09:54:24.015000Z,2025-04-10 09:55:24.015000Z,2025-04-10 09:56:24.015000Z,2025-04-10 09:57:24.015000Z,2025-04-10 09:58:24.015000Z,2025-04-10 09:59:24.015000Z,2025-04-10 10:00:24.015000Z,2025-04-10 10:01:24.015000Z,2025-04-10 10:02:24.015000Z]]

`to_csv`

Signature

to_csv(self: _goose.GoosePyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) -> None

Description

Write the relation object to a CSV file in 'file_name'

Aliases: write_csv

Parameters

file_name : str

The name of the output CSV file.
sep : str, default: ','

Field delimiter for the output file.
na_rep : str, default: ''

Missing data representation.
header : bool, default: True

Whether to write column headers.
quotechar : str, default: '"'

Character used to quote fields containing special characters.
escapechar : str, default: None

Character used to escape the delimiter if quoting is set to QUOTE_NONE.
date_format : str, default: None

Custom format string for DATE values.
timestamp_format : str, default: None

Custom format string for TIMESTAMP values.
quoting : int, default: csv.QUOTE_MINIMAL

Control field quoting behavior (e.g., QUOTE_MINIMAL, QUOTE_ALL).
encoding : str, default: 'utf-8'

Character encoding for the output file.
compression : str, default: auto

Compression type (e.g., 'gzip', 'bz2', 'zstd').
overwrite : bool, default: False

When true, all existing files inside targeted directories will be removed (not supported on remote filesystems). Only has an effect when used with partition_by.
per_thread_output : bool, default: False

When true, write one file per thread, rather than one file in total. This allows for faster parallel writing.
use_tmp_file : bool, default: False

Write to a temporary file before renaming to final name to avoid partial writes.
partition_by : list[str], default: None

List of column names to partition output by (creates folder structure).
write_partition_columns : bool, default: False

Whether or not to write partition columns into files. Only has an effect when used with partition_by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_csv("code_example.csv")

Result

The data is exported to a CSV file, named code_example.csv

`to_df`

Signature

to_df(self: _goose.GoosePyRelation, *, date_as_object: bool = False) -> pandas.DataFrame

Description

Execute and fetch all rows as a pandas DataFrame

Aliases: fetchdf, df

Parameters

date_as_object : bool, default: False

If the date columns should be interpreted as Python date objects.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_df()

Result

                                     id      description  value                created_timestamp
0  e1f79925-60fd-4ee2-ae67-5eff6b0543d1  value is uneven      1 2025-04-10 11:56:04.452000+02:00
1  caa619d4-d79c-4c00-b82e-9319b086b6f8    value is even      2 2025-04-10 11:57:04.452000+02:00
2  64c68032-99b9-4e8f-b4a3-6c522d5419b3  value is uneven      3 2025-04-10 11:58:04.452000+02:00
...

`to_parquet`

Signature

to_parquet(self: _goose.GoosePyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None, append: object = None, filename_pattern: object = None, file_size_bytes: object = None) -> None

Description

Write the relation object to a Parquet file in 'file_name'

Aliases: write_parquet

Parameters

file_name : str

The name of the output Parquet file.
compression : str, default: 'snappy'

The compression format to use (uncompressed, snappy, gzip, zstd, brotli, lz4, lz4_raw).
field_ids : STRUCT

The field_id for each column. Pass auto to attempt to infer automatically.
row_group_size_bytes : int, default: row_group_size * 1024

The target size of each row group. You can pass either a human-readable string, e.g., 2MB, or an integer, i.e., the number of bytes. This option is only used when you have issued SET preserve_insertion_order = false;, otherwise, it is ignored.
row_group_size : int, default: 122880

The target size, i.e., number of rows, of each row group.
overwrite : bool, default: False

If True, overwrite the file if it exists.
per_thread_output : bool, default: False

When True, write one file per thread, rather than one file in total. This allows for faster parallel writing.
use_tmp_file : bool, default: False

Write to a temporary file before renaming to final name to avoid partial writes.
partition_by : list[str], default: None

List of column names to partition output by (creates folder structure).
write_partition_columns : bool, default: False

Whether or not to write partition columns into files. Only has an effect when used with partition_by.
append : bool, default: False

When True, in the event a filename pattern is generated that already exists, the path will be regenerated to ensure no existing files are overwritten. Only has an effect when used with partition_by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_parquet("code_example.parquet")

Result

The data is exported to a Parquet file, named code_example.parquet

`to_table`

Signature

to_table(self: _goose.GoosePyRelation, table_name: str) -> None

Description

Creates a new table named table_name with the contents of the relation object

Aliases: create

Parameters

table_name : str

The name of the table to be created. There shouldn't be any other table with the same name.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_table("table_code_example")

Result

A table, named table_code_example, is created with the data of the relation

`to_view`

Signature

to_view(self: _goose.GoosePyRelation, view_name: str, replace: bool = True) -> _goose.GoosePyRelation

Description

Creates a view named view_name that refers to the relation object

Aliases: create_view

Parameters

view_name : str

The name of the view to be created.
replace : bool, default: True

If the view should be created with CREATE OR REPLACE. When set to False, there shouldn't be another view with the same view_name.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.to_view("view_code_example", replace=True)

Result

A view, named view_code_example, is created with the query definition of the relation

`torch`

Signature

torch(self: _goose.GoosePyRelation) -> dict

Description

Fetch a result as dict of PyTorch Tensors

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.select("value").torch()

Result

{'value': tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])}

`write_csv`

Signature

write_csv(self: _goose.GoosePyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None) -> None

Description

Write the relation object to a CSV file in 'file_name'

Aliases: to_csv

Parameters

file_name : str

The name of the output CSV file.
sep : str, default: ','

Field delimiter for the output file.
na_rep : str, default: ''

Missing data representation.
header : bool, default: True

Whether to write column headers.
quotechar : str, default: '"'

Character used to quote fields containing special characters.
escapechar : str, default: None

Character used to escape the delimiter if quoting is set to QUOTE_NONE.
date_format : str, default: None

Custom format string for DATE values.
timestamp_format : str, default: None

Custom format string for TIMESTAMP values.
quoting : int, default: csv.QUOTE_MINIMAL

Control field quoting behavior (e.g., QUOTE_MINIMAL, QUOTE_ALL).
encoding : str, default: 'utf-8'

Character encoding for the output file.
compression : str, default: auto

Compression type (e.g., 'gzip', 'bz2', 'zstd').
overwrite : bool, default: False

When true, all existing files inside targeted directories will be removed (not supported on remote filesystems). Only has an effect when used with partition_by.
per_thread_output : bool, default: False

When true, write one file per thread, rather than one file in total. This allows for faster parallel writing.
use_tmp_file : bool, default: False

Write to a temporary file before renaming to final name to avoid partial writes.
partition_by : list[str], default: None

List of column names to partition output by (creates folder structure).
write_partition_columns : bool, default: False

Whether or not to write partition columns into files. Only has an effect when used with partition_by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.write_csv("code_example.csv")

Result

The data is exported to a CSV file, named code_example.csv

`write_parquet`

Signature

write_parquet(self: _goose.GoosePyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None, write_partition_columns: object = None, append: object = None, filename_pattern: object = None, file_size_bytes: object = None) -> None

Description

Write the relation object to a Parquet file in 'file_name'

Aliases: to_parquet

Parameters

file_name : str

The name of the output Parquet file.
compression : str, default: 'snappy'

The compression format to use (uncompressed, snappy, gzip, zstd, brotli, lz4, lz4_raw).
field_ids : STRUCT

The field_id for each column. Pass auto to attempt to infer automatically.
row_group_size_bytes : int, default: row_group_size * 1024

The target size of each row group. You can pass either a human-readable string, e.g., 2MB, or an integer, i.e., the number of bytes. This option is only used when you have issued SET preserve_insertion_order = false;, otherwise, it is ignored.
row_group_size : int, default: 122880

The target size, i.e., number of rows, of each row group.
overwrite : bool, default: False

If True, overwrite the file if it exists.
per_thread_output : bool, default: False

When True, write one file per thread, rather than one file in total. This allows for faster parallel writing.
use_tmp_file : bool, default: False

Write to a temporary file before renaming to final name to avoid partial writes.
partition_by : list[str], default: None

List of column names to partition output by (creates folder structure).
write_partition_columns : bool, default: False

Whether or not to write partition columns into files. Only has an effect when used with partition_by.
append : bool, default: False

When True, in the event a filename pattern is generated that already exists, the path will be regenerated to ensure no existing files are overwritten. Only has an effect when used with partition_by.

Example

import goose

goose_conn = goose.connect()

rel = goose_conn.sql("""
        select 
            gen_random_uuid() as id, 
            concat('value is ', case when mod(range,2)=0 then 'even' else 'uneven' end) as description,
            range as value, 
            now() + concat(range,' ', 'minutes')::interval as created_timestamp
        from range(1, 10)
    """
)

rel.write_parquet("code_example.parquet")

Result

The data is exported to a Parquet file, named code_example.parquet

Lazy Evaluation​

Relation Creation​

from_arrow​

Signature​

Description​

Parameters​

Example​

Result​

from_csv_auto​

Signature​

Description​

Parameters​

Example​

Result​

from_df​

Signature​

Description​

Parameters​

Example​

Result​

from_parquet​

Signature​

Description​

Parameters​

Example​

Result​

from_query​

Signature​

Description​

Parameters​

Example​

Result​

query​

Signature​

Description​

Parameters​

Example​

Result​

read_csv​

Signature​

Description​

Parameters​

Example​

Result​

read_json​

Signature​

Description​

Parameters​

Example​

Result​

read_parquet​

Signature​

Description​

Parameters​

Example​

Result​

sql​

Signature​

Description​

Parameters​

Example​

Result​

table​

Signature​

Description​

Parameters​

Example​

Result​

table_function​

Signature​

Description​

Parameters​

Example​

Result​

values​

Signature​

Description​

Example​

Result​

view​

Lazy Evaluation

Relation Creation

`from_arrow`

Signature

Description

Parameters

Example

Result

`from_csv_auto`

Signature

Description

Parameters

Example

Result

`from_df`

Signature

Description

Parameters

Example

Result

`from_parquet`

Signature

Description

Parameters

Example

Result

`from_query`

Signature

Description

Parameters

Example

Result

`query`

Signature

Description

Parameters

Example

Result

`read_csv`

Signature

Description

Parameters

Example

Result

`read_json`

Signature

Description

Parameters

Example

Result

`read_parquet`

Signature

Description

Parameters

Example

Result

`sql`

Signature

Description

Parameters

Example

Result

`table`

Signature

Description

Parameters

Example

Result

`table_function`

Signature

Description

Parameters

Example

Result

`values`

Signature

Description

Example

Result

`view`