tdw_catalog
- class tdw_catalog.Catalog(*args, **kwargs)[source]
A Catalog is the primary client object for a ThinkData Catalog.
Parameters
- api_key: Optional[str]
Your personal API key for the Catalog platform. It may be omitted only when the key is supplied via the CATALOG_API_KEY environment variable instead.
- auth_url: Optional[str]
An optional auth URL for the Catalog platform. This parameter needs to be supplied only when connecting to a dedicated Catalog deployment, and can be populated via the CATALOG_AUTH_URL environment variable instead. Defaults to the auth URL for the ThinkData Works SaaS Catalog platform (https://account.ee.namara.io).
- api_url: Optional[str]
An optional API URL for the Catalog platform. This parameter needs to be supplied only when connecting to a dedicated Catalog deployment, and can be populated via the CATALOG_API_URL environment variable instead. Defaults to the API URL for the ThinkData Works SaaS Catalog platform (https://api.ee.namara.io).
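The fallback order described by these parameters can be sketched with the standard library alone. The helper name resolve_catalog_settings is hypothetical and not part of tdw_catalog, and the sketch assumes that an explicit argument takes precedence over the matching environment variable, which the text above does not state.

```python
import os
from typing import Mapping, Optional

# Defaults quoted from the parameter descriptions above.
DEFAULT_AUTH_URL = "https://account.ee.namara.io"
DEFAULT_API_URL = "https://api.ee.namara.io"


def resolve_catalog_settings(
    api_key: Optional[str] = None,
    auth_url: Optional[str] = None,
    api_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: explicit argument, then environment variable, then default."""
    env = os.environ if env is None else env
    key = api_key or env.get("CATALOG_API_KEY")
    if not key:
        # The API key has no default, so one of the two sources must provide it.
        raise ValueError("Pass api_key or set CATALOG_API_KEY")
    return {
        "api_key": key,
        "auth_url": auth_url or env.get("CATALOG_AUTH_URL") or DEFAULT_AUTH_URL,
        "api_url": api_url or env.get("CATALOG_API_URL") or DEFAULT_API_URL,
    }
```

If the constructor accepts these names as keyword arguments, as the parameter list suggests, the resulting dictionary could be passed along as Catalog(**settings).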
- create_organization(title: str) organization.Organization [source]
Creates an
Organization
Parameters
- titlestr
The title for the new
Organization
Returns
- Organization
The created
Organization
Raises
- CatalogException
If there is an issue communicating with the
Catalog
server, or an issue with the server itself
- get_organization(id: str) organization.Organization [source]
Retrieve a specific
Organization
Parameters
- idstr
The UUID of the
Organization
Returns
- Organization
The
Organization
which has the provided id, if it exists and the caller is a member of it
Raises
- CatalogPermissionDeniedException
If the caller is not a member of the given Organization, or if it does not exist
- CatalogException
If there is an issue communicating with the Catalog server, or an issue with the server itself
- list_organizations(filter: ListOrganizationsFilter | None = None) List[organization.Organization] [source]
Retrieve the list of Organizations to which the caller belongs.
Parameters
- filter: ListOrganizationsFilter
An optional filter on the returned Organizations (None by default).
Returns
- list[Organization]
The list of Organizations to which the caller belongs, ordered by title (ascending).
Raises
- CatalogException
If there is an issue communicating with the
Catalog
server, or an issue with the server itself
connection
- class tdw_catalog.connection.ConnectionSchedule(interval: HourlyInterval | DailyInterval | WeeklyInterval | MonthlyInterval | YearlyInterval, timezone: str)[source]
Bases:
object
A ConnectionSchedule describes the frequency with which to reingest ingested data, or re-analyze virtualized data.
Attributes
- interval: HourlyInterval | DailyInterval | WeeklyInterval | MonthlyInterval | YearlyInterval
The interval that this schedule represents
- timezone: str
The timezone in which to interpret times in the interval
- class tdw_catalog.connection.DailyInterval(minute: int, hour: int)[source]
Bases:
HourlyInterval
A DailyInterval causes a ConnectionSchedule to execute at a specific minute and hour each day.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
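As an illustration of what "a specific minute and hour each day" means once a timezone is applied, the sketch below computes the next matching wall-clock time. It is not the platform's scheduler. The function name next_daily_run is hypothetical, and the sketch takes a tzinfo object rather than the timezone string that ConnectionSchedule stores (zoneinfo.ZoneInfo can build one from such a name).

```python
from datetime import datetime, timedelta, timezone, tzinfo


def next_daily_run(now: datetime, minute: int, hour: int, tz: tzinfo) -> datetime:
    """Next time at or after `now` whose local clock reads hour:minute in `tz`."""
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Interpret the schedule in its own timezone, not in the caller's.
    local_now = now.astimezone(tz)
    candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate < local_now:
        # Today's slot has already passed, so move to tomorrow.
        candidate += timedelta(days=1)
    return candidate
```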
- class tdw_catalog.connection.HourlyInterval(minute: int)[source]
Bases:
object
An HourlyInterval causes a ConnectionSchedule to execute at a specific minute every hour.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- class tdw_catalog.connection.IngestionConnection(client, **kwargs)[source]
Bases:
_Connection
IngestionConnections are used to attach ingested data to a Dataset, describing the mechanism and necessary credentials for accessing said data. Data is ingested via an IngestionConnection: pulled from an uploaded file, or a remote location such as a cloud storage bucket.
Attributes
- id: str
The IngestionConnection's unique id
- source_id: str
The unique ID of the Source to which this IngestionConnection belongs
- source: Source
The Source associated with this IngestionConnection. A Source or source_id can be provided but not both.
- user_id: str
The unique User ID of the user who created this IngestionConnection
- label: str
The descriptive label for this IngestionConnection
- description: Optional[str] = None
An optional extended description for this IngestionConnection
- portal: ConnectionPortalType
The method of data access employed by this IngestionConnection
- url: str
A canonical URL that points to the location of data resources within the portal
- warehouse: Optional[str]
Datasets created using this IngestionConnection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential_id: Optional[str]
The Credential ID that should be used along with the portal to access Datasets when ingesting.
- credential: Optional[credential.Credential]
The Credential associated with this IngestionConnection. Omitted when virtualizing. A Credential or credential_id can be provided but not both.
- ingest_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to reingest ingested data. Specific Datasets using this IngestionConnection may override this set of ConnectionSchedules.
- disabled: Optional[bool]
When true, disables the schedule on this IngestionConnection. The IngestionConnection itself can still be used for manual ingestion or data virtualization.
- created_at: datetime
The datetime at which this IngestionConnection was created
- updated_at: datetime
The datetime at which this IngestionConnection was last updated
- class tdw_catalog.connection.MonthlyInterval(minute: int, hour: int, dayOfMonth: int)[source]
Bases:
DailyInterval
A MonthlyInterval causes a ConnectionSchedule to execute on a specific day of the month, at a specific minute and hour, every month.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfMonth: int
The day of the month to execute at, between 1 and 31, or -1 for the last day of each month
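The -1 convention for dayOfMonth can be illustrated with a small standalone helper. The name resolve_day_of_month is hypothetical. How the platform treats a day such as 31 in a 30-day month is not stated above, so this sketch simply raises an error in that case.

```python
import calendar


def resolve_day_of_month(year: int, month: int, day_of_month: int) -> int:
    """Map a dayOfMonth value (1 to 31, or -1 for the last day) to a concrete day."""
    # monthrange returns (weekday of the first day, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    if day_of_month == -1:
        return last_day
    if not 1 <= day_of_month <= 31:
        raise ValueError("dayOfMonth must be between 1 and 31, or -1")
    if day_of_month > last_day:
        # Assumption: this sketch rejects days the month does not have.
        raise ValueError(f"{year}-{month:02d} has only {last_day} days")
    return day_of_month
```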
- class tdw_catalog.connection.VirtualizationConnection(client, **kwargs)[source]
Bases:
_Connection
VirtualizationConnections are used to attach virtualized data to a Dataset, describing the mechanism and necessary credentials for accessing said data. Data is accessed from a remote location without being copied into the platform.
Attributes
- id: str
The VirtualizationConnection's unique id
- source_id: str
The unique ID of the Source to which this VirtualizationConnection belongs
- source: Source
The Source associated with this VirtualizationConnection. A Source or source_id can be provided but not both.
- user_id: str
The unique User ID of the user who created this VirtualizationConnection
- label: str
The descriptive label for this VirtualizationConnection
- description: Optional[str] = None
An optional extended description for this VirtualizationConnection
- portal: ConnectionPortalType
The method of data access employed by this VirtualizationConnection
- url: str
A canonical URL that points to the location of data resources within the portal
- warehouse: Optional[str]
Virtualized datasets created using this connection will always access data from this Warehouse (must be supplied for virtualization). Non-virtualized datasets created using this connection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential_id: Optional[str]
The Credential ID that should be used along with the portal to access Datasets when ingesting. Omitted when virtualizing.
- credential: Optional[credential.Credential]
The Credential associated with this connection. Omitted when virtualizing. A Credential or credential_id can be provided but not both.
- default_schema: str
The schema to search for tables and views
- metrics_collection_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to re-analyze virtualized data. Specific Datasets using this VirtualizationConnection may override this set of ConnectionSchedules.
- disabled: Optional[bool]
When true, disables the schedule on this VirtualizationConnection. The VirtualizationConnection itself can still be used for manual ingestion or data virtualization.
- created_at: datetime
The datetime at which this VirtualizationConnection was created
- updated_at: datetime
The datetime at which this VirtualizationConnection was last updated
- class tdw_catalog.connection.WeeklyInterval(minute: int, hour: int, dayOfWeek: int)[source]
Bases:
DailyInterval
A WeeklyInterval causes a ConnectionSchedule to execute on a specific day of the week, at a specific minute and hour, every week.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfWeek: int
The day of the week to execute at, beginning on Sunday, between 0 and 6
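Python's own date.weekday() numbers days from Monday (0) to Sunday (6), which differs from the Sunday-based numbering used by dayOfWeek. A minimal conversion, assuming that 0 denotes Sunday and 6 denotes Saturday as the description implies, might look like this. The helper name is hypothetical.

```python
from datetime import date


def sunday_based_day_of_week(d: date) -> int:
    """Return 0 for Sunday through 6 for Saturday."""
    # isoweekday() runs Monday=1 .. Sunday=7, so modulo 7 moves Sunday to 0.
    return d.isoweekday() % 7
```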
- class tdw_catalog.connection.YearlyInterval(minute: int, hour: int, dayOfMonth: int, month: int)[source]
Bases:
MonthlyInterval
A YearlyInterval causes a ConnectionSchedule to execute on a specific day of a specific month, at a specific minute and hour, every year.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfMonth: int
The day of the month to execute at, between 1 and 31, or -1 for the last day of the month
- month: int
The month of the year to execute at, between 1 and 12
credential
- class tdw_catalog.credential.CatalogCredential(client, **kwargs)[source]
Bases:
Credential
A CatalogCredential permits a Source to access datasets which exist on another ThinkData Works Catalog server.
Attributes
- catalog_api_key: str
The API key for the target Catalog. Can be updated, but not read.
- class tdw_catalog.credential.Credential(client, **kwargs)[source]
Bases:
EntityBase
,_OrganizationRelation
Credentials are used in conjunction with Sources to ingest data into Datasets
Parameters
- id: str
The Credential's unique id
- organization_id: str
The unique ID of the Organization to which this Credential belongs
- user_id: str
The unique user ID of the user who created this Credential
- name: str
A name for this Credential
- description: str
The optional description of this Credential
- created_at: datetime
The datetime at which this Credential was created
- updated_at: datetime
The datetime at which this Credential was last updated
- delete() None [source]
Delete this Credential. This Credential object should not be used after delete() returns successfully.
Parameters
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this Credential
- CatalogInvalidArgumentException
If the given Credential does not exist
- CatalogException
If call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve a Credential belonging to an Organization
Parameters
- client: Catalog
The Catalog client to use to get the Credential
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the Credential
Returns
- Credential
The Credential associated with the given ID
- save() None [source]
Update this Credential, saving any changes to its name, description or type-specific fields.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this
Credential
- CatalogException
If call to the
Catalog
server fails
- class tdw_catalog.credential.CredentialFactory(client: Catalog, organization_id: str)[source]
Bases:
object
A CredentialFactory creates specific types of Credentials within a specific Organization
- catalog_credential(name: str, description: str | None, catalog_api_key: str) CatalogCredential [source]
Constructs a CatalogCredential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- catalog_api_key: str
The API key for the target Catalog
Returns
- CatalogCredential
The created CatalogCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- ftp_credential(name: str, description: str | None, username: str, password: str) FTPCredential [source]
Constructs an FTPCredential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target FTP server
- password: str
The password for the target FTP server
Returns
- FTPCredential
The created FTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- google_storage_credential(name: str, description: str | None, region: str, project: str, client_secrets: str) GoogleStorageCredential [source]
Constructs a GoogleStorageCredential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- project: str
The name of the Google Cloud project in which the bucket can be found
- region: str
The Google Cloud region in which the bucket can be found (e.g. us-central1)
- client_secrets: str
The client secrets for the Google Storage bucket. Can be updated, but not read.
Returns
- GoogleStorageCredential
The created GoogleStorageCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- s3_credential(name: str, description: str | None, region: str, access_key_id: str, secret_access_key: str) S3Credential [source]
Constructs an S3Credential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- region: str
The AWS S3 region in which the bucket resides
- access_key_id: str
The AWS Access Key for the S3 bucket. Can be updated, but not read.
- secret_access_key: str
The AWS Secret Access Key for the S3 bucket. Can be updated, but not read.
Returns
- S3Credential
The created S3Credential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- sftp_with_key_credential(name: str, description: str | None, username: str, ssh_key: str) SFTPCredential [source]
Constructs a key-based SFTPCredential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target SFTP server
- ssh_key: str
The SSH key for the target SFTP server
Returns
- SFTPCredential
The created SFTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- sftp_with_password_credential(name: str, description: str | None, username: str, password: str) SFTPCredential [source]
Constructs a password-based SFTPCredential
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target SFTP server
- password: str
The password for the target SFTP server
Returns
- SFTPCredential
The created SFTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If call to the Catalog server fails
- class tdw_catalog.credential.FTPCredential(client, **kwargs)[source]
Bases:
Credential
An FTPCredential permits a Source to access data stored on an FTP server.
Attributes
- username: str
The username for the target FTP server
- password: str
The password for the target FTP server. Can be updated, but not read.
- class tdw_catalog.credential.GoogleStorageCredential(client, **kwargs)[source]
Bases:
Credential
A GoogleStorageCredential permits a Source to access data stored in a Google Storage (GS) bucket.
Attributes
- project: str
The name of the Google Cloud project in which the bucket can be found
- region: str
The Google Cloud region in which the bucket can be found (e.g. us-central1)
- client_secrets: str
The client secrets for the Google Storage bucket. Can be updated, but not read.
- class tdw_catalog.credential.S3Credential(client, **kwargs)[source]
Bases:
Credential
An S3Credential permits a Source to access data stored in an AWS S3 (or other S3-compatible) bucket.
Attributes
- region: str
The AWS S3 region in which the bucket resides
- access_key_id: str
The AWS Access Key for the S3 bucket. Can be updated, but not read.
- secret_access_key: str
The AWS Secret Access Key for the S3 bucket. Can be updated, but not read.
- class tdw_catalog.credential.SFTPCredential(client, **kwargs)[source]
Bases:
Credential
An SFTPCredential permits a Source to access data stored on an SFTP server.
Attributes
- username: str
The username for the target SFTP server
- password: str
The password for the target SFTP server. Can be updated, but not read.
- ssh_key: Optional[str]
The SSH key for the target SFTP server. Either ssh_key or password must be set. Can be updated, but not read.
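The rule that either ssh_key or password must be set can be checked before calling the factory. The helper below is a hypothetical sketch that returns the name of the matching CredentialFactory method documented above. Preferring the key when both secrets are present is an assumption of this sketch, not documented behaviour.

```python
from typing import Optional


def pick_sftp_factory_method(password: Optional[str], ssh_key: Optional[str]) -> str:
    """Return the CredentialFactory method name that matches the supplied secret."""
    if ssh_key:
        # Assumption: a key is preferred when both secrets are supplied.
        return "sftp_with_key_credential"
    if password:
        return "sftp_with_password_credential"
    raise ValueError("Either ssh_key or password must be set")
```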
dataset
- class tdw_catalog.dataset.ConnectedDataset(client, **kwargs)[source]
Bases:
Dataset
A ConnectedDataset is identical to a Dataset and inherits all of its fields, but represents a Dataset which is connected to the actual underlying data asset via a Connection. A ConnectedDataset supports queries, export, health monitoring, etc.
Attributes
- exports_disabled: bool
A flag that controls whether this Dataset may be exported. Disabling exports does not prevent querying on this Dataset. Only relevant if the Dataset is connected to data.
- warehouse: str
The underlying data warehouse where the data resides
- metrics_last_collected_at: datetime
The last time metrics were collected for this Dataset (virtualized Datasets) or the last time the Dataset was imported (ingested Datasets).
- next_scheduled_metrics_collection_time: Optional[datetime]
If this Dataset has an associated connection schedule, the next time this Dataset will collect metrics (virtualized Datasets) or import (ingested Datasets).
- last_metrics_collection_failure_time: datetime
The most recent time metrics collection (virtualized Datasets) or import (ingested Datasets) failed. None if metrics collection has never failed.
- warehouse_metadata: Optional[List[metadata_field.MetadataField]]
Harvested metadata from virtualized Datasets. None for ingested Datasets.
- property advanced_configuration: str
This configuration string is auto-generated during ingestion or virtualization, inferred from the connected data. It can be modified, with caution, to alter how the Catalog perceives and represents the connected data.
Modification of this configuration without support from ThinkData Works is not recommended.
- connect() DatasetConnector [source]
Manage all connection-related aspects of this ConnectedDataset.
There are many methods for connecting a Dataset, thus a helper object is returned with various method-based workflows that aid in connecting to data.
Returns
- DatasetConnector
A helper object for configuring this Dataset's connection to data.
- property connection: IngestionConnection | VirtualizationConnection
The underlying IngestionConnection or VirtualizationConnection which links this Dataset to data
- property connection_id: str
The ID of the underlying IngestionConnection or VirtualizationConnection which links this Dataset to data
- async export_csv(query: str | None = None) CSVExport [source]
Async function which returns the URL which can be used to stream a CSV-formatted copy of the connected data, optionally filtered by the supplied SQL-like NiQL query. Note that most standard SQL keywords are supported, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) are not.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Unlike ConnectedDataset.query(), there is no limit on exported rows, other than any imposed by the underlying warehouse.
Parameters
- query: Optional[str]
A NiQL query used to filter or reshape the data before exporting
Returns
- CSVExport
A CSVExport object containing a signed download URL which can be used to fetch the exported data. It can be downloaded in its entirety, or streamed in chunks. This CSVExport object improves the usability of the CSV data when employing pandas, including a configuration for read_csv which can be passed via **export as follows: df = pd.read_csv(export.url, **export), ensuring that the resultant DataFrame has the correct schema for all fields (including dates). Note: It is recommended that export_parquet be employed for use with pandas when supported by the underlying warehouse.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to export data
- CatalogInvalidArgumentException
If the given query is invalid
- CatalogException
If call to the
Catalog
server fails, or the export process itself fails
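Because {this} must reach the platform as literal text, building a query with an f-string requires doubled braces. The sketch below shows one way to assemble such a query. The helper name build_niql is hypothetical, and the keyword check is only a crude client-side guard covering the three keywords named above, not a substitute for the platform's own validation.

```python
import re
from typing import Optional, Sequence

# The keywords named above as unsupported because they modify underlying data.
_MUTATING = re.compile(r"\b(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def build_niql(columns: Sequence[str], where: Optional[str] = None) -> str:
    """Build a SELECT against the current dataset, keeping `{this}` literal."""
    column_list = ", ".join(columns) if columns else "*"
    # Doubled braces stop the f-string from treating {this} as a Python expression.
    query = f"SELECT {column_list} FROM {{this}}"
    if where:
        query += f" WHERE {where}"
    if _MUTATING.search(query):
        raise ValueError("NiQL queries may not modify data")
    return query
```

The resulting string could then be passed as the query argument of export_csv, export_parquet, or query.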
- async export_parquet(query: str | None = None) ParquetExport [source]
Async function which returns the URL which can be used to stream a Parquet-formatted copy of the connected data, optionally filtered by the supplied SQL-like NiQL query. Note that most standard SQL keywords are supported, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) are not.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Unlike ConnectedDataset.query(), there is no limit on exported rows, other than any imposed by the underlying warehouse.
Note: Parquet export is not (yet) supported for all underlying warehouse types, but this export method should be preferred when interfacing with pandas whenever possible.
Parameters
- query: Optional[str]
A NiQL query used to filter or reshape the data before exporting
Returns
- ParquetExport
A ParquetExport object containing a signed download URL which can be used to fetch the exported data. It can be downloaded in its entirety, or streamed in chunks. This ParquetExport object can be directly employed by pandas as follows: df = pd.read_parquet(export.url). Note that pandas requires pyarrow or fastparquet in order to read_parquet. Note: It is recommended that export_parquet be employed for use with pandas when supported by the underlying warehouse.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to export data
- CatalogInvalidArgumentException
If the given query is invalid, or if Parquet export is not available for this warehouse type
- CatalogException
If call to the
Catalog
server fails, or the export process itself fails
- property health_monitoring_enabled: bool
Whether or not Catalog platform health monitoring is enabled for this ConnectedDataset
- property metrics_collection_schedules: List[ConnectionSchedule] | None
Returns all configured schedules for metrics collection, which govern health monitoring intervals and ingestion intervals for ingested Datasets
- async query(query: str | None = None) QueryCursor [source]
Async function which returns a Python DB API-style Cursor object (PEP 249), representing the results of the supplied SQL-like NiQL query executed against the connected data.
Note that NiQL supports most standard SQL keywords, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) may not be used.
Note that the Catalog platform enforces a global limit on results (10,000 rows) from a single query.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Parameters
- query: Optional[str]
A NiQL query to execute against the connected data
Returns
- QueryCursor
The query results cursor, which can be printed, converted to a pandas DataFrame via pd.DataFrame(res.fetchall()), etc.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to query data
- CatalogInvalidArgumentException
If the given query is invalid
- CatalogException
If call to the Catalog server fails, or the query itself fails
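A PEP 249 cursor exposes column names through its description attribute, so rows can be turned into dictionaries without pandas. The sketch below uses the standard library's sqlite3 cursor as a stand-in, since a QueryCursor cannot be created without a Catalog server. It assumes that QueryCursor populates description as PEP 249 specifies, which the text above does not confirm. The helper name rows_as_dicts is hypothetical.

```python
import sqlite3


def rows_as_dicts(cursor) -> list:
    """Pair each fetched row with the column names from the cursor's description."""
    # In PEP 249, description is a sequence of 7-item tuples; item 0 is the column name.
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# sqlite3 stands in for a QueryCursor here, purely for illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sample (id INTEGER, title TEXT)")
connection.execute("INSERT INTO sample VALUES (1, 'first'), (2, 'second')")
records = rows_as_dicts(connection.execute("SELECT id, title FROM sample ORDER BY id"))
connection.close()
```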
- async reconnect()[source]
Manually triggers a reimport of ingested data for ingested datasets, and metrics collection (health monitoring, etc.) for virtualized and ingested datasets.
Useful for forcing a metrics collection, or applying changes made to the advanced_configuration.
- refresh() ConnectedDataset [source]
Return a fresh copy of this ConnectedDataset, with up-to-date property values. Useful after performing an update, connection, etc.
- property set_advanced_configuration: str
This configuration string is auto-generated during ingestion or virtualization, inferred from the connected data. It can be modified, with caution, to alter how the Catalog perceives and represents the connected data.
Modification of this configuration without support from ThinkData Works is not recommended.
- property set_health_monitoring_enabled: bool
Whether or not Catalog platform health monitoring is enabled for this ConnectedDataset
- property set_metrics_collection_schedules: List[ConnectionSchedule] | None
Returns all configured schedules for metrics collection, which govern health monitoring intervals and ingestion intervals for ingested Datasets
- class tdw_catalog.dataset.Dataset(client, **kwargs)[source]
Bases:
EntityBase
,_OrganizationRelation
,_SourceRelation
A Dataset represents a cataloged data asset within an Organization. It is a container for structured and custom metadata describing the asset, and can optionally be connected to the data asset via an IngestionConnection or VirtualizationConnection to support queries, health monitoring, etc.
Attributes
- id: str
The Dataset's unique ID
- title: str
The title of the Dataset
- description: Optional[str]
The full description text (supports Markdown) that helps describe this Dataset
- uploader_id: str
- source_id: str
- source: str
- organization_id: str
The unique ID of the Organization which this Dataset belongs to
- organization: Organization
The Organization which this Dataset belongs to
- metadata_template: MetadataTemplate
The MetadataTemplate attached to this Dataset, if any
- data_dictionary: DataDictionary
The DataDictionary defined within this Dataset, or describing the schema of the connected data if this is a ConnectedDataset
- created_at: datetime
The date this Dataset was originally created
- updated_at: datetime
The date this Dataset's metadata was last modified
- attach_template(template: MetadataTemplate)[source]
Attach a MetadataTemplate to this Dataset. Values may be supplied to templated fields immediately, but the template will only be attached when Dataset.save() is called.
Parameters
- template: MetadataTemplate
The MetadataTemplate to be attached to the Dataset
Returns
- Dataset
The Dataset with a newly attached MetadataTemplate
- classify(topic: Topic) None [source]
Classify this Dataset with a Topic, linking them semantically
Parameters
Returns
None
Raises
- connect() DatasetConnector [source]
Converts a Dataset into a ConnectedDataset, by accessing data via an IngestionConnection or VirtualizationConnection. A ConnectedDataset can represent ingested data, which is copied into the Catalog platform, or virtualized data, which is accessed remotely by the platform without being copied.
There are many methods for connecting a Dataset, thus a helper object is returned with various method-based workflows that aid in connecting to data.
Returns
- DatasetConnector
A helper object for configuring this Dataset's connection to data.
- property custom_metadata: List[MetadataField]
A list of MetadataFields attached to this Dataset that are not associated with an attached MetadataTemplate
- declassify(topic: Topic) None [source]
Remove a Topic classification from this Dataset
Parameters
Returns
None
Raises
- delete() None [source]
Delete this Dataset. The Dataset object should not be used after this method is invoked successfully.
Raises
- detach_template()[source]
Remove the attached MetadataTemplate from this Dataset. Any fields from this MetadataTemplate will remain on the Dataset, but as individual MetadataFields. Detachment happens instantly, and calling Dataset.save() is not necessary for the changes to persist.
Parameters
None
Returns
- Dataset
The Dataset with no attached MetadataTemplate
- classmethod get(client: Catalog, id: str, context_organization: organization.Organization | None = None)[source]
Retrieve a Dataset
Parameters
- client: Catalog
- id: str
The unique ID of the Dataset
- context_organization: Optional[Organization]
The Organization from which this Dataset is being retrieved. Datasets may be accessible from multiple Organizations, but can have differing metadata within each. This context parameter is necessary to determine which metadata to load.
Returns
- Dataset
The Dataset associated with the given ID
Raises
- list_topics(organization_id: str | None = None, filter: Filter | None = None) List[Topic] [source]
Retrieves the list of all Topics this Dataset is currently classified under, within the given Organization
Parameters
- organization_id: Optional[str]
An optional ID for an Organization other than the original Organization the Dataset was created in (e.g. if the Dataset has been shared to another organization with a different set of Topics)
- filter: Optional[Filter]
An optional tdw_catalog.utils.Filter to offset or limit the list of Topics returned
Returns
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Topics in this Organization
- CatalogException
If call to the Catalog server fails
- refresh() Dataset [source]
Return a fresh copy of this Dataset, with up-to-date property values. Useful after performing an update, connection, etc.
- property templated_metadata: List[MetadataField]
A list of MetadataFields attached to this Dataset that are associated with an attached MetadataTemplate
- update_custom_metadata() MetadataEditor [source]
Provides a MetadataEditor which allows for the addition, removal, and alteration of MetadataFields on this Dataset that are not associated with an attached MetadataTemplate
Parameters
None
Returns
- MetadataEditor
An editor for adding, removing, and updating MetadataFields on the Dataset which do not belong to a MetadataTemplate
- update_templated_metadata() TemplatedMetadataEditor [source]
Provides a TemplatedMetadataEditor which allows for the alteration of MetadataFields on this Dataset that are associated with an attached MetadataTemplate. This object cannot add or remove MetadataFields; that must be done on the MetadataTemplate directly.
Parameters
None
Returns
- TemplatedMetadataEditor
An editor for updating MetadataFields on the Dataset that are associated with an attached MetadataTemplate
dataset_connector
- class tdw_catalog.dataset_connector.DatasetConnector(d: Dataset | ConnectedDataset)[source]
Bases:
object
A helper object for configuring a Dataset's connection to data. Can either connect a Dataset for the first time, or reconnect an already-connected Dataset to different data.
- async ingest_from_file(local_file_path: str, connection: IngestionConnection | None = None, target_warehouse: TargetWarehouse | None = None) ConnectedDataset [source]
Async function which uploads a local file to the Catalog platform and ingests it, connecting this Dataset to that ingested data.
Parameters
- local_file_pathstr
The path to the file on disk. The file will be streamed from disk, rather than read into memory, to ensure large files upload successfully.
- connectionOptional[IngestionConnection]
Optionally specify a file upload-type IngestionConnection for use. This IngestionConnection must reside within the existing Dataset's Source, and must be of the correct type (ConnectionPortalType.IMPORT_LITE). If not provided, the first available file upload Connection within the Dataset's Source will be used, or one will be created if none are available.
- target_warehouseOptional[TargetWarehouse]
Optionally specify a target warehouse to ingest to. If omitted, the TargetWarehouse specified by the IngestionConnection will be used, or the default TargetWarehouse for the Organization if the IngestionConnection does not specify a default TargetWarehouse.
Returns
- ConnectedDataset
The newly connected Dataset, if it was not connected previously, or an updated version of the existing ConnectedDataset if it was connected previously. Further Dataset operations should be performed on this returned object.
Raises
- FileNotFoundError
If the specified local_file_path does not exist
- CatalogPermissionDeniedException
If the caller is not allowed to perform any of the steps involved in ingesting data from a file
- CatalogInvalidArgumentException
If the given IngestionConnection cannot be used
- CatalogException
If call to the
Catalog
server fails, or the ingest process itself fails
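A minimal sketch of driving ingest_from_file from synchronous code. The helper name ingest_local_file is hypothetical, and connector is assumed to be an object exposing the async ingest_from_file method documented above (for instance a DatasetConnector built from a Dataset); nothing here is taken from the library's own examples.

```python
import asyncio
import os


async def ingest_local_file(connector, path):
    """Upload `path` through a DatasetConnector-like object.

    `connector` is assumed to expose the documented async
    `ingest_from_file(local_file_path, ...)` method.
    """
    # Fail early with a clear error; the documentation lists
    # FileNotFoundError for a missing path as well.
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # Keep working with the returned object rather than the original Dataset.
    return await connector.ingest_from_file(path)
```

From a script, this could be called as connected = asyncio.run(ingest_local_file(connector, "data.csv")); inside an existing event loop, await it directly instead.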
data_dictionary
- class tdw_catalog.data_dictionary.Column(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None)[source]
Bases:
object
A single Column within a DataDictionary
Attributes
- keystr
The column name for this Column, within the actual Warehouse where the data lives
- typeColumnType
The data type for this Column. Available types can be found in ColumnType.
- name: Optional[str]
An optional friendly name for this Column, which is visually used in place of the key throughout the Catalog
- description: Optional[str]
An optional description for this Column
- apply_glossary_term(glossary_term: glossary_term.GlossaryTerm) None [source]
Apply a GlossaryTerm to this Column. The containing DataDictionary must be saved for the change to take permanent effect.
Parameters
- glossary_termGlossaryTerm
The GlossaryTerm to classify this Column with
Returns
None
Raises
- CatalogInvalidArgumentException
If the Organization of the GlossaryTerm does not match the Organization which the Dataset
was retrieved from.
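Because a glossary term only takes permanent effect once the containing dictionary is saved, the two steps are easy to bundle. The sketch below is an illustration only: classify_column is a hypothetical helper, and data_dictionary is assumed to be a DataDictionary-like object supporting key lookup and save().

```python
def classify_column(data_dictionary, column_key, glossary_term):
    """Apply `glossary_term` to one column and persist the change."""
    # Dictionaries are documented as supporting lookup by column key.
    column = data_dictionary[column_key]
    column.apply_glossary_term(glossary_term)
    # Without this save the classification is not made permanent.
    data_dictionary.save()
    return column
```

Applying several terms before a single save() would avoid repeated round trips, assuming the server accepts batched changes.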
- list_glossary_terms() List[glossary_term.GlossaryTerm] [source]
Return a list of
GlossaryTerm
s that have been applied to thisColumn
Parameters
None
Returns
- List[glossary_term.GlossaryTerm]
The list of
GlossaryTerm
s that have been applied to thisColumn
Raises
- CatalogPermissionDeniedException
If the caller does not have permission to list
GlossaryTerm
s on aDataset
‘sColumn
s- CatalogInternalException
If call to the
Catalog
server fails
- remove_glossary_term(glossary_term: glossary_term.GlossaryTerm) None [source]
Remove a
GlossaryTerm
from thisColumn
. The containingDataDictionary
must be saved for the change to take permanent effect.Parameters
- glossary_termGlossaryTerm
The
GlossaryTerm
to be removed from thisColumn
Returns
None
- class tdw_catalog.data_dictionary.CurrencyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None, symbol: str | None = None)[source]
Bases:
Column
A currency-specific extension of
Column
, with an added currency symbol (such as $)Attributes
- symbolOptional[str]
An optional currency symbol (e.g.
'$'
)
- class tdw_catalog.data_dictionary.DataDictionary(dataset: Dataset, last_updated_at: datetime, version_id: str | None, columns: List[Column])[source]
Bases:
object
A DataDictionary describes the schema of data represented by a Dataset as a sequence of Columns, each with a key, name, type, and optional description.
A DataDictionary behaves as a dict: columns can be accessed via their key, as in data_dictionary["column_name"].
Attributes
- last_updated_at: datetime
The last time this DataDictionary was updated, either by hand (for Datasets which are not connected) or via a scheduled metrics collection (for ConnectedDatasets which are)
- columns: List[Column]
The list of Columns which make up this DataDictionary
- columns() List[Column] [source]
Returns all
Column
s in thisDataDictionary
- has_key(key: str) bool [source]
Returns True if and only if a Column with the given key exists in this DataDictionary
- property last_updated_at: datetime
Returns the last time this
DataDictionary
was modified
- save()[source]
Update this DataDictionary, saving all changes to its schema
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this
DataDictionary
- CatalogException
If call to the
Catalog
server fails
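As a read-only illustration of has_key and key lookup, the sketch below reports which of a set of expected columns exist but still lack a description. The helper name undocumented_columns is hypothetical; it assumes only the has_key method, dict-style access, and the description attribute described above.

```python
def undocumented_columns(data_dictionary, keys):
    """Return the keys that exist in the dictionary but have no description."""
    missing = []
    for key in keys:
        # Skip keys that are not part of this dictionary at all.
        if not data_dictionary.has_key(key):
            continue
        # Treat both None and an empty string as "no description".
        if not data_dictionary[key].description:
            missing.append(key)
    return missing
```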
- class tdw_catalog.data_dictionary.MetadataOnlyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None)[source]
Bases:
Column
Identical to
Column
, but within aMetadataOnlyDataDictionary
attached to aDataset
which is not connected to data. When not connected, all aspects of a data dictionary can be freely modified (includingkey
andtype
), as there is no underlying data providing/constraining the dictionary.Attributes
- keystr
The column name for this
Column
, within the actualWarehouse
where the data lives- typeColumnType
The data type for this
Column
. Available types can be found inColumnType
.- name: str
An optional friendly name for this
Column
, which is visually used in place of thekey
throughout theCatalog
- description: Optional[str]
An optional description for this
Column
- class tdw_catalog.data_dictionary.MetadataOnlyCurrencyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None, symbol: str | None = None)[source]
Bases:
CurrencyColumn
,MetadataOnlyColumn
The
MetadataOnlyColumn
version ofCurrencyColumn
Attributes
- symbolOptional[str]
The currency symbol
- class tdw_catalog.data_dictionary.MetadataOnlyDataDictionary(dataset: Dataset, last_updated_at: datetime, version_id: str | None, columns: List[Column])[source]
Bases:
DataDictionary
A MetadataOnlyDataDictionary is identical to a DataDictionary, but is attached to a Dataset which is not connected to data.
Because the Dataset is not connected, all aspects of the dictionary can be modified freely, including column keys, types, etc. (because they are not constrained by existing underlying data).
A MetadataOnlyDataDictionary behaves as a dict: columns can be accessed (and overwritten) via their key, as in data_dictionary["column_name"] = ...
Attributes
- last_updated_at: datetime
The last time this DataDictionary was updated, either by hand (for Datasets which are not connected) or via a scheduled metrics collection (for ConnectedDatasets which are)
- columns: List[MetadataOnlyColumn]
The list of MetadataOnlyColumns which make up this DataDictionary
- add(col: Column, index: int | None = None) MetadataOnlyDataDictionary [source]
Appends a specific
Column
to thisMetadataOnlyDataDictionary
, or inserts it at a specificindex
.Parameters
Returns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
- clear() MetadataOnlyDataDictionary [source]
Removes all
Column
s from thisMetadataOnlyDataDictionary
Returns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
- columns() List[MetadataOnlyColumn] [source]
Returns all
Column
s in thisMetadataOnlyDataDictionary
- remove(key: str) MetadataOnlyDataDictionary [source]
Removes a specific
Column
from thisMetadataOnlyDataDictionary
by keyParameters
- keystr
The key of the
Column
Returns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
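Since add, clear, and remove each return the dictionary itself, calls can be chained. The following sketch reorders a column using that property; move_column_to_front is a hypothetical helper, and it assumes the dictionary supports lookup by key as described for MetadataOnlyDataDictionary.

```python
def move_column_to_front(dictionary, key):
    """Move the column stored under `key` to index 0 and return the dictionary."""
    column = dictionary[key]
    # remove() returns the dictionary, so add() can be chained onto it.
    return dictionary.remove(key).add(column, index=0)
```

The change is local until save() is called on the returned dictionary.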
errors
- exception tdw_catalog.errors.CatalogAbortedException(*args, message, meta={})[source]
Bases:
CatalogException
The operation was aborted, typically due to a concurrency issue like sequencer check failures, transaction aborts, etc.
- exception tdw_catalog.errors.CatalogAlreadyExistsException(*args, message, meta={})[source]
Bases:
CatalogException
An attempt to create an entity failed because one already exists.
- exception tdw_catalog.errors.CatalogBadRouteException(*args, message, meta={})[source]
Bases:
CatalogException
The requested URL path wasn’t routable to a known method. This is returned by generated server code and should not be returned by application code (use “not_found” or “unimplemented” instead).
- exception tdw_catalog.errors.CatalogCanceledException(*args, message, meta={})[source]
Bases:
CatalogException
The operation was cancelled
- exception tdw_catalog.errors.CatalogDataLossException(*args, message, meta={})[source]
Bases:
CatalogException
The operation resulted in unrecoverable data loss or corruption.
- exception tdw_catalog.errors.CatalogDeadlineExceededException(*args, message, meta={})[source]
Bases:
CatalogException
Operation expired before completion. For operations that change the state of the system, this error may be returned even if the operation has completed successfully (timeout).
- exception tdw_catalog.errors.CatalogException(*args, code=Errors.Unknown, message='', meta={})[source]
Bases:
TwirpServerException
The most generic Catalog platform error
- exception tdw_catalog.errors.CatalogFailedPreconditionException(*args, message, meta={})[source]
Bases:
CatalogException
The operation was rejected because the system is not in a state required for the operation’s execution. For example, doing an rmdir operation on a directory that is non-empty, or on a non-directory object, or when having conflicting read-modify-write on the same resource.
- exception tdw_catalog.errors.CatalogInternalException(*args, message, meta={})[source]
Bases:
CatalogException
When some invariants expected by the underlying system have been broken. In other words, something bad happened in the library or backend service. Twirp specific issues like wire and serialization problems are also reported as “internal” errors.
- exception tdw_catalog.errors.CatalogInvalidArgumentException(*args, message, meta={})[source]
Bases:
CatalogException
The client specified an invalid argument. This indicates arguments that are invalid regardless of the state of the system (i.e. a malformed file name, required argument, number out of range, etc.).
- exception tdw_catalog.errors.CatalogMalformedException(*args, message, meta={})[source]
Bases:
CatalogException
The client sent a message which could not be decoded. This may mean that the message was encoded improperly or that the client and server have incompatible message definitions.
- exception tdw_catalog.errors.CatalogNoErrorException(*args, message, meta={})[source]
Bases:
CatalogException
- exception tdw_catalog.errors.CatalogNotFoundException(*args, message, meta={})[source]
Bases:
CatalogException
Some requested entity was not found.
- exception tdw_catalog.errors.CatalogOutOfRangeException(*args, message, meta={})[source]
Bases:
CatalogException
The operation was attempted past the valid range. For example, seeking or reading past end of a paginated collection. Unlike “invalid_argument”, this error indicates a problem that may be fixed if the system state changes (i.e. adding more items to the collection).
- exception tdw_catalog.errors.CatalogPermissionDeniedException(*args, message, meta={})[source]
Bases:
CatalogException
The caller does not have permission to execute the specified operation. It must not be used if the caller cannot be identified (use “unauthenticated” instead).
- exception tdw_catalog.errors.CatalogResourceExhaustedException(*args, message, meta={})[source]
Bases:
CatalogException
Some resource has been exhausted or rate-limited, perhaps a per-user quota, or perhaps the entire file system is out of space.
- exception tdw_catalog.errors.CatalogUnauthenticatedException(*args, message, meta={})[source]
Bases:
CatalogException
The request does not have valid authentication credentials for the operation.
- exception tdw_catalog.errors.CatalogUnavailableException(*args, message, meta={})[source]
Bases:
CatalogException
The service is currently unavailable. This is most likely a transient condition and may be corrected by retrying with a backoff.
- exception tdw_catalog.errors.CatalogUnimplementedException(*args, message, meta={})[source]
Bases:
CatalogException
The operation is not implemented or not supported/enabled in this service.
- exception tdw_catalog.errors.CatalogUnknownException(*args, message, meta={})[source]
Bases:
CatalogException
An unknown error occurred. For example, this can be used when handling errors raised by APIs that do not return any error information.
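The error descriptions above note that an unavailable service is most likely transient and may be corrected by retrying with a backoff. A generic sketch of that pattern follows; call_with_backoff and its parameters are hypothetical, not part of tdw_catalog, and the caller supplies whichever exception class (or tuple of classes) it considers retryable.

```python
import time


def call_with_backoff(operation, retryable, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `operation()` and retry on `retryable` exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            # Give up and re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

Retrying is only safe for operations that can be repeated; as noted for deadline errors, a state-changing call may have completed even though an error was returned.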
export
- class tdw_catalog.export.CSVExport[source]
Bases:
_Export
CSVExport represents a signed download URL pointing to the CSV-formatted result of a Dataset export_csv() operation, alongside metadata concerning the exported data.
This class is deliberately formatted for use with pandas’ read_csv function, as follows: e1 = await dataset.export_csv() and df = pd.read_csv(e1.url, **e1)
Attributes
- query: str
The query statement which was used to create the Export
- created_at: datetime
The time this Export was originally created
- started_at: datetime
The time this Export was started
- finished_at: datetime
The time this Export was completed
- url: str
The CSV-formatted export results can be downloaded via this signed URL
- dtypeDict[str, Type]
Metadata describing the schema of the exported data
- parse_dates: List[str]
A list of columns within dtype that should be interpreted as dates
- true_valuesList[str]
A list of values to interpret as “truthy”
- false_valuesList[str]
A list of values to interpret as “falsey”
- compressionOptional[str]
Indicates the compression format of the data, if any
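The two-step pattern described for CSVExport can be wrapped in one coroutine. This is a sketch under assumptions: export_to_frame is a hypothetical name, dataset is assumed to offer the async export_csv() shown above, and the reader function is passed in (normally pandas.read_csv) so the helper itself has no third-party import.

```python
async def export_to_frame(dataset, read_csv):
    """Export a dataset to CSV and load it with `read_csv` (e.g. pandas.read_csv)."""
    export = await dataset.export_csv()
    # The export unpacks into keyword arguments such as dtype and parse_dates,
    # matching the read_csv usage shown in the class description.
    return read_csv(export.url, **export)
```

With pandas installed, this might be used as df = asyncio.run(export_to_frame(dataset, pd.read_csv)).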
glossary_term
- class tdw_catalog.glossary_term.GlossaryTerm(client, **kwargs)[source]
Bases:
EntityBase
,_OrganizationRelation
GlossaryTerms are used to categorize and classify columns within Datasets
Attributes
- idstr
GlossaryTerm's unique id
- organization_idstr
The unique ID of the Organization to which this GlossaryTerm belongs
- user_idstr
The unique ID of the User who created this GlossaryTerm
- titlestr
The title for this GlossaryTerm
- description: Optional[str]
An optional description for this GlossaryTerm
- created_atdatetime
The datetime at which this GlossaryTerm was created
- updated_atdatetime
The datetime at which this GlossaryTerm was last updated
- delete() None [source]
Delete this
GlossaryTerm
. ThisGlossaryTerm
object should not be used after delete() has successfully returnedParameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this
GlossaryTerm
- CatalogNotFoundException
If the
GlossaryTerm
being deleted does not exist- CatalogException
If call to the
Catalog
server fails
- save() None [source]
Update this
GlossaryTerm
, saving any changes to its titleParameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this
GlossaryTerm
, or if the givenGlossaryTerm
ID does not exist- CatalogException
If call to the
Catalog
server fails
list_datasets
- class tdw_catalog.list_datasets.DatasetAlias(alias_key: str, alias_values: List[str])[source]
Bases:
object
Used to sort the results of
list_datasets
by specific aliases
- class tdw_catalog.list_datasets.Filter(limit: int = None, offset: int = None, keywords: List[str] | None = None, dataset_ids: List[str] | None = None, dataset_aliases: List[DatasetAlias] | None = None, reference_ids: List[str] | None = None, sources: List[Source] | None = None, topics: List[Topic] | None = None, creators: List[OrganizationMember] | None = None, states: List[ImportState] | None = None, warehouses: List[Warehouse] | None = None, timestamp_range: TimestampRange | None = None, sort: Sort | None = None)[source]
Bases:
LegacyFilter
ListOrganizationDatasetsFilter filters the results from list_datasets on Organization.
Attributes
- keywordsOptional[List[str]]
Filters results according to the specified keyword(s) (fuzzy matching is supported)
- dataset_idsOptional[List[str]]
Filters results to the list of given Dataset id(s)
- dataset_aliasesOptional[List[DatasetAlias]]
Filters results to the list of given Dataset alias(es)
- sourcesOptional[List[Source]]
Filters results to the list of given Source(s)
- topicsOptional[List[Topic]]
Filters results to the list of given Topic(s)
- creatorsOptional[List[OrganizationMember]]
Filters results to the list of given OrganizationMember(s), who created the returned Datasets
- statesOptional[List[ImportState]]
Filters results to the list of given ImportStates. Note that virtualized datasets will always be categorized as IMPORTED.
- warehouses: Optional[List[Warehouse]]
Filters results to the list of given Warehouses
- timestamp_rangeOptional[TimestampRange]
Filters results to those within the given TimestampRange
- sortOptional[Sort]
Sorts filtered results according to the provided Sort structure
- class tdw_catalog.list_datasets.Sort(field: SortableField, order: FilterSortOrder | None = FilterSortOrder.ASC)[source]
Bases:
object
Used to sort the results of
list_datasets
onOrganization
.
- enum tdw_catalog.list_datasets.SortableField(value)[source]
Bases:
StrEnum
The different fields which
list_datasets
onOrganization
can be sorted by- Member Type:
str
Valid values are as follows:
- TITLE = <SortableField.TITLE: 'title'>
- CREATED_AT = <SortableField.CREATED_AT: 'created_at'>
- IMPORTED_AT = <SortableField.IMPORTED_AT: 'imported_at'>
- UPDATED_AT = <SortableField.UPDATED_AT: 'updated_at'>
- STATE = <SortableField.STATE: 'reference_state'>
- NEXT_INGEST = <SortableField.NEXT_INGEST: 'reference_next_ingest'>
- FAILED_AT = <SortableField.FAILED_AT: 'reference_failed_at'>
- SOURCE_NAME = <SortableField.SOURCE_NAME: 'source_label'>
- enum tdw_catalog.list_datasets.TimestampField(value)[source]
Bases:
IntEnum
The different possible fields that can be used to construct a
TimestampRange
filter forlist_datasets
onOrganization
- Member Type:
int
Valid values are as follows:
- CREATED_AT = <TimestampField.CREATED_AT: 0>
- UPDATED_AT = <TimestampField.UPDATED_AT: 1>
- IMPORTED_AT = <TimestampField.IMPORTED_AT: 2>
- NEXT_INGEST = <TimestampField.NEXT_INGEST: 3>
- FAILED_AT = <TimestampField.FAILED_AT: 5>
- class tdw_catalog.list_datasets.TimestampRange(filter_by: TimestampField, start_time: datetime | None, end_time: datetime | None)[source]
Bases:
object
Used to construct a temporal filter for
list_datasets
onOrganization
, where a filter specifies aTimestampField
and a time range
organization
- class tdw_catalog.organization.Organization(client, **kwargs)[source]
Bases:
EntityBase
Organizations are the primary entrypoints to a Data Catalog, containing and linking together OrganizationMembers, Teams, Datasets, etc.
Attributes
- titlestr
The name of the Organization
- created_atdatetime
The datetime at which this Organization was created
- updated_atdatetime
The datetime at which this Organization was last updated
- create_credential() CredentialFactory [source]
Provides a
CredentialFactory
which is capable of creatingCredential
s within thisOrganization
.Parameters
Returns
- CredentialFactory
A factory for creating specific types of
Credential
s
- create_dataset(source: Source, title: str, description: str | None = None) Dataset [source]
Creates a new
Dataset
within thisOrganization
. TheDataset
will have a title and (optionally) a description, and must be associated with aSource
. TheDataset
will otherwise be empty and can be subsequently populated with metadata and data.Parameters
Returns
- Dataset
The newly created
Dataset
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Dataset
s in thisOrganization
- CatalogInvalidArgumentException
If title is an empty string, or if the
Source
belongs to a differentOrganization
than this one- CatalogException
If call to the
Catalog
server fails
- create_glossary_term(title: str, description: str | None = None) GlossaryTerm [source]
Create a
GlossaryTerm
within thisOrganization
Parameters
- title: str
The name of the new
GlossaryTerm
- description: Optional[str]
The description of the new
GlossaryTerm
Returns
- GlossaryTerm
The newly created
GlossaryTerm
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
GlossaryTerm
s in thisOrganization
- CatalogInvalidArgumentException
If title is an empty string
- CatalogAlreadyExistsException
If a
GlossaryTerm
with the provided title already exists in thisOrganization
- CatalogException
If call to the
Catalog
server fails
- create_lineage(upstream_dataset: Dataset, downstream_dataset: Dataset, label: str, description: str | None = None, column_lineage: List[tuple[Union[str, List[str]], Union[str, List[str]]]] = []) DatasetLineageRelationship [source]
Create a DatasetLineageRelationship within this Organization. Each relationship describes a single source and destination Dataset (a single “edge” in the lineage graph), with optional column-level lineage.
Branching (or many-to-many) relationships can be modelled by decomposing them into their individual edges.
Parameters
- upstream_datasetDataset
The source dataset involved in this
DatasetLineageRelationship
- downstream_datasetDataset
The destination dataset involved in this
DatasetLineageRelationship
- labelstr
A label describing this
DatasetLineageRelationship
- descriptionOptional[str]
An optional description providing further details about this
DatasetLineageRelationship
- column_lineageList[tuple[Union[str, List[str]],Union[str,List[str]]]]
An optional list of column-level associations between the two Datasets, specified as tuples. Each tuple is a single column-level relationship between a list of upstream columns and a list of downstream columns. This argument defaults to the empty List if not supplied. Example: [("address", ["street_number","street_name","city"])]
Returns
- DatasetLineageRelationship
The newly created
DatasetLineageRelationship
Raises
- CatalogInvalidArgumentException
If any specified column names within provided column lineage do not actually exist in the provided Datasets
- CatalogPermissionDeniedException
If the caller is not allowed to define lineage in this Organization, or if they do not have access to one of the involved Datasets
- CatalogException
If call to the
Catalog
server fails
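The decomposition of a many-to-many relationship into individual edges can be sketched as below. create_lineage_edges is a hypothetical helper, and it assumes the simplest reading of "many-to-many", namely that every upstream dataset feeds every downstream dataset; organization is any object with the create_lineage method documented above.

```python
from itertools import product


def create_lineage_edges(organization, upstreams, downstreams, label):
    """Create one lineage relationship per (upstream, downstream) pair."""
    return [
        organization.create_lineage(
            upstream_dataset=up, downstream_dataset=down, label=label
        )
        # product() yields every pairing, one edge of the lineage graph each.
        for up, down in product(upstreams, downstreams)
    ]
```

If only some pairs are actually related, passing an explicit list of pairs instead of the full cross product would be the more faithful model.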
- create_metadata_template(title: str, description: str | None = None) MetadataTemplateCreationBuilder [source]
Provides a
MetadataTemplateCreationBuilder
which is capable of creatingMetadataTemplate
s within thisOrganization
.Parameters
- titlestr
The title for the
MetadataTemplate
- descriptionOptional[str]
An optional description for the
MetadataTemplate
Returns
- MetadataTemplateCreationBuilder
A factory for creating new
MetadataTemplate
s
- create_or_replace_lineage(upstream_dataset: Dataset, downstream_dataset: Dataset, label: str, description: str | None = None, column_lineage: List[tuple[List[str], List[str]]] = []) DatasetLineageRelationship [source]
Create or replace a DatasetLineageRelationship within this Organization. Each relationship describes a single source and destination Dataset (a single “edge” in the lineage graph), with optional column-level lineage.
Branching (or many-to-many) relationships can be modelled by decomposing them into their individual edges.
If no relationships between the given Datasets exist, one will be created. Unlike create_lineage, pre-existing relationships between the given Datasets will be cleared and replaced by this one, facilitating easy one-way syncs from an external lineage metadata source to the Catalog platform.
Parameters
- upstream_datasetDataset
The source dataset involved in this
DatasetLineageRelationship
- downstream_datasetDataset
The destination dataset involved in this
DatasetLineageRelationship
- labelstr
A label describing this
DatasetLineageRelationship
- descriptionOptional[str]
An optional description providing further details about this
DatasetLineageRelationship
- column_lineageList[tuple[Union[str, List[str]],Union[str,List[str]]]]
An optional list of column-level associations between the two Datasets, specified as tuples. Each tuple is a single column-level relationship between a list of upstream columns and a list of downstream columns. This argument defaults to the empty List if not supplied. Example: [("address", ["street_number","street_name","city"])]
Returns
- DatasetLineageRelationship
The newly created
DatasetLineageRelationship
Raises
- CatalogInvalidArgumentException
If any specified column names within provided column lineage do not actually exist in the provided
Dataset
s- CatalogPermissionDeniedException
If the caller is not allowed to define lineage in this
Organization
, or if they do not have access to one of the involvedDataset
s- CatalogException
If call to the
Catalog
server fails
- create_source(label: str, description: str | None = None) Source [source]
Create a
Source
within thisOrganization
Parameters
Returns
- Source:
The newly created
Source
Raises
- CatalogInternalException
If call to the
Catalog
server fails
- create_team(title: str) Team [source]
Create a
Team
within thisOrganization
Parameters
- title: str
The name of the new
Team
Returns
- Team
The newly created
Team
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Team
s in thisOrganization
- CatalogException
If call to the
Catalog
server fails
- create_topic(title: str) Topic [source]
Create a
Topic
within thisOrganization
Parameters
- title: str
The name of the new
Topic
Returns
- Topic
The newly created
Topic
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Topic
s in thisOrganization
- CatalogException
If call to the
Catalog
server fails
- delete() None [source]
Delete this
Organization
. ThisOrganization
object should not be used after delete() has successfully returned, as theCatalog
organization it represents will no longer exist.Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this
Organization
- CatalogException
If call to the
Catalog
server fails
- get_connection(id: str) IngestionConnection | VirtualizationConnection [source]
Retrieve the given
IngestionConnection
orVirtualizationConnection
from thisOrganization
Parameters
- idstr
The unique ID of the Connection
Returns
- Union[IngestionConnection,VirtualizationConnection]
The Connection with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve Connections from this
Organization
- CatalogNotFoundException
If the given Connection ID does not exist
- CatalogException
If call to the
Catalog
server fails
- get_credential(credential_id: str) Credential [source]
Retrieve a
Credential
belonging to thisOrganization
Parameters
- credential_idstr
The unique ID of the
Credential
Returns
- Credential
The
Credential
associated with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Credential
s- CatalogNotFoundException
If the given
Credential
ID does not exist- CatalogException
If call to the
Catalog
server fails
- get_dataset(id: str) Dataset | ConnectedDataset [source]
Retrieve the given
Dataset
from thisOrganization
Parameters
- idstr
The unique ID of the
Dataset
Returns
- Dataset
The
Dataset
with the given ID
Raises
- get_glossary_term(id: str) GlossaryTerm [source]
Retrieve the given
GlossaryTerm
from thisOrganization
Parameters
- idstr
The unique ID of the
GlossaryTerm
Returns
- GlossaryTerm
The
GlossaryTerm
with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
GlossaryTerm
s from thisOrganization
, or if the givenGlossaryTerm
ID does not exist- CatalogException
If call to the
Catalog
server fails
- get_lineage(id: str) DatasetLineageRelationship [source]
Retrieve the given
DatasetLineageRelationship
from thisOrganization
Parameters
- idstr
The unique ID of the
DatasetLineageRelationship
Returns
- DatasetLineageRelationship
The
DatasetLineageRelationship
with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
DatasetLineageRelationship
s from thisOrganization
- CatalogInvalidArgumentException
If the given
DatasetLineageRelationship
ID does not exist- CatalogException
If call to the
Catalog
server fails
- get_member(user_id: str) OrganizationMember [source]
Retrieve a specific member (User) of this Organization
Parameters
- user_idstr
The unique
User
ID of theOrganizationMember
Returns
- OrganizationMember
The
OrganizationMember
with the givenUser
ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to fetch
OrganizationMember
s- CatalogInvalidArgumentException
If the given
User
ID does not exist or is not a member of thisOrganization
- CatalogException
If call to the
Catalog
server fails
- get_metadata_template(id: str) MetadataTemplate [source]
Retrieve a
MetadataTemplate
belonging to thisOrganization
Parameters
- idstr
The unique ID of the
MetadataTemplate
Returns
- MetadataTemplate
The
MetadataTemplate
associated with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
MetadataTemplate
s- CatalogNotFoundException
If the given
MetadataTemplate
ID does not exist- CatalogException
If call to the
Catalog
server fails
- get_source(id: str) Source [source]
Retrieve a
Source
belonging to thisOrganization
Parameters
- idstr
The unique ID of the
Source
Returns
- Source
The
Source
associated with the given ID
Raises
- get_team(team_id: str) Team [source]
Retrieve the given
Team
from thisOrganization
Parameters
- team_idstr
The unique ID of the
Team
Returns
- Team
The
Team
with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Team
s from thisOrganization
- CatalogInvalidArgumentException
If the given
Team
ID does not exist- CatalogException
If call to the
Catalog
server fails
- get_topic(id: str) Topic [source]
Retrieve the given
Topic
from thisOrganization
Parameters
- idstr
The unique ID of the
Topic
Returns
- Topic
The
Topic
with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Topic
s from thisOrganization
, or if the givenTopic
ID does not exist- CatalogException
If call to the
Catalog
server fails
- invite_member(user_id: str, roles: OrganizationMemberRoles = None) OrganizationMember [source]
Invite the given
User
to be anOrganizationMember
of thisOrganization
Parameters
Returns
- OrganizationMember
The newly created
OrganizationMember
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite
OrganizationMember
s- CatalogAlreadyExistsException
If the caller is inviting a
User
who is already anOrganizationMember
of thisOrganization
- CatalogInvalidArgumentException
If the given
User
ID does not exist- CatalogException
If call to the
Catalog
server fails
- invite_members(emails: List[str], invite_message: str | None = '', raise_on_failure: bool | None = False, roles: OrganizationMemberRoles | None = None) InviteMembersResponse [source]
Invite the given User(s) to become OrganizationMembers of this Organization. If a given email does not correspond to an existing User, an invitation to the Catalog platform will be sent via email.
Parameters
- emails
The list of email addresses of the invitees.
- invite_messageOptional[str]
The message to send the users when sending the invitation
- raise_on_failureOptional[bool]
Whether to raise an exception on a failure of any one invite
- rolesOptional[OrganizationMemberRoles]
The roles the new members will take when invited
Returns
- InviteMembersResponse
This contains a summary of the successful and failed invitations.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite
OrganizationMember
s- CatalogException
If call to the
Catalog
server fails
- list_connections(filter: ListConnectionsFilter | None = None) List[IngestionConnection | VirtualizationConnection] [source]
List all VirtualizationConnections and IngestionConnections in this Organization
Parameters
- filter: Optional[ListConnectionsFilter]
An optional filter on the returned Connection list, useful for pagination of results. Note that the organization_id property will be set automatically to this Organization.
Returns
- List[Union[IngestionConnection, VirtualizationConnection]]
The list of Connections in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Connections in this Organization
- CatalogException
If the call to the Catalog server fails
- list_credentials(filter: LegacyFilter = None) List[Credential] [source]
List Credentials which belong to the given Organization
Returns
- List[Credential]
Credentials created under this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Credentials
- CatalogException
If the call to the Catalog server fails
- list_datasets(filter: Filter | None = None) Iterator[Dataset] [source]
Retrieve the list of Datasets which belong to the Organization. The number of results returned per call is limited; use the filter to paginate and obtain additional results.
Parameters
- filter: Optional[list_datasets.Filter]
An optional filter on the returned Datasets (None by default)
Returns
- Iterator[Dataset]
An Iterator of Datasets belonging to this Organization, which are lazily fetched as the Iterator is iterated.
Raises
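Because list_datasets returns a lazily fetched Iterator, a caller who only needs the first few Datasets can stop early instead of materializing everything. A minimal sketch using the standard library follows; the helper name take is hypothetical and works on any iterator, not only the one returned by the SDK.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def take(items: Iterable[T], count: int) -> List[T]:
    """Return at most `count` items, pulling no more than that from `items`.

    Hypothetical helper (not part of tdw_catalog). Since the source is
    consumed lazily, results beyond `count` are never requested.
    """
    return list(islice(items, count))
```

For example, take(organization.list_datasets(), 10) would collect up to ten Datasets, assuming organization is an Organization.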
- list_external_warehouses() List[ExternalWarehouse] [source]
Retrieve the list of known ExternalWarehouses available to this Organization
Parameters
None
Returns
- List[ExternalWarehouse]
ExternalWarehouses that are available to this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list ExternalWarehouses (the caller must be an Organization admin, or have Dataset creation privileges)
- CatalogException
If the call to the Catalog server fails
- list_glossary_terms(organization_ids: List[str] | None = None, filter: ListGlossaryTermsFilter | None = None) List[GlossaryTerm] [source]
List all GlossaryTerms in this Organization
Parameters
- organization_ids: Optional[List[str]]
An optional list of Organization IDs, used to list GlossaryTerms from multiple Organizations
- filter: Optional[ListGlossaryTermsFilter]
An optional ListGlossaryTermsFilter on the returned GlossaryTerm list, useful for pagination of results
Returns
- List[GlossaryTerm]
The list of GlossaryTerms in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list GlossaryTerms in this Organization
- CatalogException
If the call to the Catalog server fails
- list_members(filter: LegacyFilter | None = None) List[OrganizationMember] [source]
Retrieve all OrganizationMembers of this Organization
Parameters
None
Returns
- List[OrganizationMember]
The OrganizationMembers which are members of this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list OrganizationMembers
- CatalogException
If the call to the Catalog server fails
- list_metadata_templates() List[MetadataTemplate] [source]
List all MetadataTemplates which belong to the given Organization
Returns
- List[MetadataTemplate]
MetadataTemplates created under this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list MetadataTemplates
- CatalogException
If the call to the Catalog server fails
- list_sources(filter: ListSourcesFilter | None = None) List[Source] [source]
List Sources which belong to the given Organization
Parameters
- filter: Optional[ListSourcesFilter]
The ListSourcesFilter to be used when performing the search
Returns
- List[Source]
Sources created under this Organization
Raises
- CatalogException
If the call to the Catalog server fails
- list_target_warehouses() List[TargetWarehouse] [source]
Retrieve the list of known TargetWarehouses available to this Organization
Parameters
None
Returns
- List[TargetWarehouse]
TargetWarehouses that are available to this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list TargetWarehouses (the caller must be an Organization admin, or have Dataset creation privileges)
- CatalogException
If the call to the Catalog server fails
- list_teams(organization_ids=None, filter: LegacyFilter | None = None) List[Team] [source]
List all Teams in this Organization
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Team list, useful for pagination of results
Returns
- List[Team]
The list of Teams in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Teams in this Organization
- CatalogException
If the call to the Catalog server fails
- list_topics(filter: LegacyFilter = None) List[Topic] [source]
List all Topics in this Organization
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Topic list, useful for pagination of results
Returns
- List[Topic]
The list of Topics in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Topics in this Organization
- CatalogException
If the call to the Catalog server fails
- save() None [source]
Update this Organization, saving any changes to its title
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this Organization
- CatalogException
If the call to the Catalog server fails
organization_member
- class tdw_catalog.organization_member.OrganizationMember(client, **kwargs)[source]
Bases: User, _OrganizationRelation
An OrganizationMember reflects a relationship between a User and an Organization, where the User has been invited to the Organization and been granted specific privileges within the Organization.
Attributes
- user_id: str
The unique user ID of the OrganizationMember
- organization: organization.Organization
The Organization object that relates to the organization_id of this model
- organization_id: str
The unique ID of the Organization to which this OrganizationMember belongs
- roles: OrganizationMemberRoles
The roles this member has within their Organization
- created_at: datetime
The datetime at which this OrganizationMember was added to the Organization
- updated_at: datetime
The datetime at which this OrganizationMember was last updated
- delete() None [source]
Remove this OrganizationMember from the Organization. This OrganizationMember object should not be used after delete() returns successfully.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this OrganizationMember, or if the caller is attempting to delete themselves
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve an OrganizationMember belonging to this Organization
Parameters
- client: Catalog
The Catalog client of the Organization containing the OrganizationMember
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the OrganizationMember
Returns
- OrganizationMember
The OrganizationMember associated with the given ID
Raises
- CatalogInternalException
If the call to the Catalog server fails
- CatalogNotFoundException
If no OrganizationMember is found matching the provided ID
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve this OrganizationMember
- get_teams(filter: LegacyFilter | None = None) List[Team] [source]
Retrieve the Teams to which this OrganizationMember belongs
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Team list, useful for pagination of results
Returns
- List[Team]
The list of Teams to which this OrganizationMember belongs
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve Teams from this Organization
- CatalogException
If the call to the Catalog server fails
- save() None [source]
Update this OrganizationMember, saving any changes to its roles
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this OrganizationMember
- CatalogAlreadyExistsException
If the caller is attempting to invite a member who is already a member of the Organization
- CatalogException
If the call to the Catalog server fails
- class tdw_catalog.organization_member.OrganizationMemberRoles(role_data_uploader: bool = False, role_data_viewer: bool = False, role_data_editor: bool = False, role_member_manager: bool = False, role_organization_manager: bool = False, role_admin: bool = False, role_topic_manager: bool = False, role_field_template_manager: bool = False)[source]
Bases: object
OrganizationMemberRoles defines the roles which an OrganizationMember has within an Organization
Attributes
- role_data_uploader: bool
Whether this OrganizationMember is allowed to upload to the Organization
- role_data_viewer: bool
Whether this OrganizationMember is allowed to view Datasets within the Organization
- role_data_editor: bool
Whether this OrganizationMember is allowed to modify Datasets within the Organization
- role_member_manager: bool
Whether this OrganizationMember is allowed to manage members within the Organization
- role_organization_manager: bool
Whether this OrganizationMember is allowed to manage the Organization
- role_admin: bool
Whether this OrganizationMember is an Organization administrator
- role_topic_manager: bool
Whether this OrganizationMember is allowed to manage Topics within the Organization
- role_field_template_manager: bool
Whether this OrganizationMember is allowed to manage MetadataTemplates
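Since every role flag defaults to False, a role set can be described by naming only the flags to grant. The sketch below builds the keyword arguments for the constructor from a list of flag names; the flag names are taken from the signature above, while the helper role_flags is hypothetical and not part of the SDK.

```python
from typing import Dict

# Flag names copied from the OrganizationMemberRoles constructor signature.
ROLE_FLAGS = (
    "role_data_uploader",
    "role_data_viewer",
    "role_data_editor",
    "role_member_manager",
    "role_organization_manager",
    "role_admin",
    "role_topic_manager",
    "role_field_template_manager",
)


def role_flags(*granted: str) -> Dict[str, bool]:
    """Return constructor keyword arguments with only `granted` set to True.

    Hypothetical helper (not part of tdw_catalog). Raises ValueError for a
    flag name that the constructor does not accept.
    """
    unknown = sorted(set(granted) - set(ROLE_FLAGS))
    if unknown:
        raise ValueError(f"unknown role flag(s): {', '.join(unknown)}")
    return {flag: flag in granted for flag in ROLE_FLAGS}
```

The result could then be passed along as OrganizationMemberRoles(**role_flags("role_data_viewer", "role_data_editor")).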
organization_utils
- class tdw_catalog.organization_utils.InviteMembersResponse(failed_invitations: List[InviteMembersResponseFailedInvitation], successful_invitations: List[OrganizationMember])[source]
Bases: object
InviteMembersResponse contains the successfully invited members and summarizes any failed invitations.
Attributes
- failed_invitations: List[InviteMembersResponseFailedInvitation]
List of email addresses and error message summaries of the failed invitations.
- successful_invitations: List[organization_member.OrganizationMember]
List of members who were successfully invited to the Organization.
- class tdw_catalog.organization_utils.InviteMembersResponseFailedInvitation(email: str, error_message: str)[source]
Bases: object
InviteMembersResponseFailedInvitation is a container for a single failed invitation, providing information about why that invitation failed to send.
Attributes
- email: str
The email address of the invitee
- error_message: str
A message indicating why the invitation failed to send.
query
- class tdw_catalog.query.QueryCursor(res: Dict[str, any])[source]
Bases: object
QueryCursor is a Python DB API-style Cursor object (PEP 249) for query results from the Catalog.
Attributes
- arraysize: number
Read/write attribute that controls the number of rows returned by fetchmany(). The default value is 1, which means a single row is fetched per call.
- description: List[tuple]
Read-only attribute that provides the column names of the last query. To remain compatible with the Python DB API, it returns a 7-tuple for each column where the last five items of each tuple are None.
- fetchall() List[tuple] [source]
Return all (remaining) rows of a query result as a list. Return an empty list if no rows are available.
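The description attribute and fetchall() can be combined to turn result rows into dictionaries keyed by column name. The following sketch assumes, as PEP 249 specifies, that the column name is the first item of each description tuple; rows_as_dicts is a hypothetical helper, not part of the SDK, and works with any DB API-style cursor.

```python
from typing import Any, Dict, List


def rows_as_dicts(cursor: Any) -> List[Dict[str, Any]]:
    """Fetch all remaining rows and key each one by column name.

    Hypothetical helper (not part of tdw_catalog). Assumes `cursor`
    follows PEP 249: `description` holds one tuple per column with the
    column name first, and `fetchall()` returns a list of row tuples.
    """
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]
```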
- fetchmany(size=None) List[tuple] [source]
Return the next set of rows of a query result as a list. Return an empty list if no more rows are available.
The number of rows to fetch per call is specified by the size parameter. If size is not given, arraysize determines the number of rows to be fetched. If fewer than size rows are available, as many rows as are available are returned.
Note there are performance considerations involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, then it is best for it to retain the same value from one fetchmany() call to the next.
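The fetchmany() contract above lends itself to a simple batching loop: set arraysize once, then call fetchmany() until it returns an empty list. This is a minimal sketch under that reading of the documentation; iter_batches is a hypothetical helper, not part of the SDK, and it accepts any DB API-style cursor.

```python
from typing import Any, Iterator, List, Tuple


def iter_batches(cursor: Any, batch_size: int) -> Iterator[List[Tuple]]:
    """Yield lists of rows until the cursor is exhausted.

    Hypothetical helper (not part of tdw_catalog). It sets `arraysize`
    once, as the documentation recommends, rather than passing `size`
    on every fetchmany() call.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    cursor.arraysize = batch_size
    while True:
        rows = cursor.fetchmany()
        if not rows:
            return  # an empty list signals that no rows remain
        yield rows
```

Processing results batch by batch keeps memory use bounded for large result sets, at the cost of more fetch calls.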
source
- class tdw_catalog.source.Source(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
A Source is used to semantically group a set of related Datasets. Users are free to label a Source in a descriptive way to best understand the meaning behind this grouping.
Attributes
- id: str
The Source's unique ID
- organization: Organization
The Organization associated with this Source. An Organization or organization_id can be provided, but not both.
- organization_id: str
The unique ID of the Organization to which this Source belongs
- user_id: str
The unique user ID of the OrganizationMember who created this Source
- label: str
A descriptive label for this Source
- description: Optional[str] = None
An optional extended description for this Source
- created_at: datetime
The datetime at which this Source was created
- updated_at: datetime
The datetime at which this Source was last updated
- create_ingestion_connection(label: str, portal: ConnectionPortalType, url: str | None = None, description: str | None = None, warehouse: Warehouse | None = None, credential: Credential | None = None, ingest_schedules: List[ConnectionSchedule] | None = None) IngestionConnection [source]
Create an IngestionConnection within this Source
Parameters
- label: str
The descriptive label for this IngestionConnection
- portal: ConnectionPortalType
The method of data access employed by this IngestionConnection
- url: Optional[str]
A canonical URL that points to the location of data resources within the portal
- description: Optional[str] = None
An optional extended description for this IngestionConnection
- warehouse: Optional[Warehouse]
Datasets created using this IngestionConnection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential: Optional[Credential]
The Credential associated with this IngestionConnection.
- ingest_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to reingest ingested data. Specific Datasets using this IngestionConnection may override this set of Schedules.
Returns
- IngestionConnection
The newly created IngestionConnection
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create IngestionConnections in this Organization
- CatalogException
If the call to the Catalog server fails
- delete() None [source]
Delete this Source. This Source object should not be used after delete() has successfully returned
Raises
- classmethod get(client, organization_id: str, id: str)[source]
Retrieve a Source belonging to this Organization
Parameters
Returns
- Source
The Source associated with the given ID
Raises
- list_connections(filter: ListConnectionsFilter | None = None) List[IngestionConnection | VirtualizationConnection] [source]
List all IngestionConnections and VirtualizationConnections belonging to this Source
Parameters
- filter: Optional[ListConnectionsFilter]
An optional filter on the returned Connection list, useful for pagination of results. Note that the organization_id and source_ids properties will be set automatically to this Organization and Source.
Returns
- List[Connection]
The list of Connections in this Source
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Connections in this Organization
- CatalogException
If the call to the Catalog server fails
team
- class tdw_catalog.team.Team(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
Teams are sets of OrganizationMembers, with which Datasets can be shared.
Attributes
- id: str
The unique ID of this Team
- organization: organization.Organization
The Organization that relates to the organization_id on the model
- organization_id: str
The unique ID of the Organization to which this Team belongs
- title: str
The name of this Team
- created_at: datetime
The datetime at which this Team was created
- updated_at: datetime
The datetime at which this Team was last updated
- add_member(user_id: str, permission: TeamMemberPermissionLevel) TeamMember [source]
Add a User to the Team as a TeamMember. The User in question must already be a member of the containing Organization.
Parameters
- user_id: str
The unique User ID of the invitee
Returns
- TeamMember
The newly created TeamMember
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite Team members
- CatalogAlreadyExistsException
If the caller is inviting a User who is already a TeamMember of this Team
- CatalogInvalidArgumentException
If the given User ID does not exist
- CatalogException
If the call to the Catalog server fails
- delete() None [source]
Delete this Team. This Team object should not be used after delete() has successfully returned
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this Team
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve a Team belonging to this Organization
Parameters
- client: Catalog
A Catalog client
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the Team
Returns
- Team
The Team associated with the given ID
Raises
- get_member(user_id: str) TeamMember [source]
Retrieve a specific member (User) of this Team
Parameters
- user_id: str
The unique User ID of the TeamMember
Returns
- TeamMember
The TeamMember with the given User ID
Raises
- list_members(filter: LegacyFilter | None = None) List[TeamMember] [source]
Retrieve all TeamMembers of this Team
Parameters
None
Returns
- List[TeamMember]
The TeamMembers which are members of this Team
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list TeamMembers
- CatalogException
If the call to the Catalog server fails
team_member
- class tdw_catalog.team_member.TeamMember(client, **kwargs)[source]
Bases: User
A TeamMember reflects a relationship between a User and a Team, where the User has been invited to the Team and been granted specific privileges within the Team.
Attributes
- team: team.Team
The Team that relates to the team_id of the model
- team_id: str
The unique ID of the Team to which this TeamMember belongs
- permission: TeamMemberPermissionLevel
- created_at: datetime
The timestamp at which this TeamMember was added to the Team
- updated_at: datetime
The timestamp at which this TeamMember's permission was last changed
- delete() None [source]
Remove this TeamMember from the Team. This TeamMember object should not be used after delete() returns successfully.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this TeamMember, or if the caller is attempting to delete themselves
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, team_id: str, id: str)[source]
Retrieve a TeamMember
Parameters
- client: Catalog
The Catalog client
- team_id: str
The unique ID of the Team
- id: str
The unique ID of the TeamMember
Returns
- TeamMember
The TeamMember associated with the given ID
Raises
- CatalogInternalException
If the call to the Catalog server fails
- CatalogNotFoundException
If no TeamMember is found matching the provided ID
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve this TeamMember
- save() None [source]
Update this TeamMember, saving any changes to its permission level
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this TeamMember
- CatalogInvalidArgumentException
If the caller supplies an invalid permission level before saving this TeamMember
- CatalogException
If the call to the Catalog server fails
topic
- class tdw_catalog.topic.Topic(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
Topics are used to classify Datasets within an Organization. Classification can be used as a means to apply a grouping label to one or more Datasets.
Attributes
- id: str
The Topic's unique ID
- organization_id: str
The unique ID of the Organization to which this Topic belongs
- created_by: str
The unique user ID of the user who created this Topic
- title: str
The title for this Topic
- created_at: datetime
The datetime at which this Topic was created
- updated_at: datetime
The datetime at which this Topic was last updated
- delete() None [source]
Delete this Topic. This Topic object should not be used after delete() has successfully returned
Parameters
None
Returns
None
Raises
user
- class tdw_catalog.user.User(client, **kwargs)[source]
Bases: EntityBase
A User is a registered user of the ThinkData Catalog. Currently, Users can only be created through the Catalog user interface, and cannot be created through the API. Users can be added as members of Organizations and Teams, and have Datasets shared with them.
Attributes
warehouse
- class tdw_catalog.warehouse.ExternalWarehouse(client, **kwargs)[source]
Bases: Warehouse
An ExternalWarehouse is a Warehouse which is configured for Data Virtualization. New data cannot be written to an ExternalWarehouse, but virtualized Datasets can be created which read from it.
Attributes
- database_name: Optional[str]
If set, the database to virtualize tables and views from
- schema: Optional[str]
If set, the schema to virtualize tables and views from
- class tdw_catalog.warehouse.TargetWarehouse(client, **kwargs)[source]
Bases: Warehouse
A TargetWarehouse is a Warehouse which is configured for data ingestion. New data can be written to a TargetWarehouse.
- class tdw_catalog.warehouse.Warehouse(client, **kwargs)[source]
Bases: EntityBase
A Warehouse is a place where Datasets are stored. Currently, Warehouses are configured at the deployment level and cannot be modified through this SDK.
Attributes
- name: str
The unique name of the warehouse in the system. This name will never change for the life of the Warehouse.
- display_name: str
The descriptive name of the Warehouse.
- warehouse_type: str
The type of Warehouse this represents.
- external: Optional[bool]
True if this Warehouse is virtualized within the Catalog
utils
- enum tdw_catalog.utils.ColumnType(value)[source]
Bases: StrEnum
The different possible data types for Columns within a DataDictionary
- Member Type: str
Valid values are as follows:
- BOOLEAN = <ColumnType.BOOLEAN: 'boolean'>
- DATE = <ColumnType.DATE: 'date'>
- DATETIME = <ColumnType.DATETIME: 'datetime'>
- INTEGER = <ColumnType.INTEGER: 'integer'>
- DECIMAL = <ColumnType.DECIMAL: 'decimal'>
- PERCENT = <ColumnType.PERCENT: 'percent'>
- CURRENCY = <ColumnType.CURRENCY: 'currency'>
- STRING = <ColumnType.STRING: 'string'>
- TEXT = <ColumnType.TEXT: 'text'>
- GEOMETRY = <ColumnType.GEOMETRY: 'geometry'>
- GEOJSON = <ColumnType.GEOJSON: 'geojson'>
- enum tdw_catalog.utils.ConnectionPortalType(value)[source]
Bases: StrEnum
- Member Type: str
Valid values are as follows:
- GS = <ConnectionPortalType.GS: 'Gs'>
- S3 = <ConnectionPortalType.S3: 'S3'>
- UNITY = <ConnectionPortalType.UNITY: 'Unity'>
- FTP = <ConnectionPortalType.FTP: 'Ftp'>
- SFTP = <ConnectionPortalType.SFTP: 'Sftp'>
- EXTERNAL = <ConnectionPortalType.EXTERNAL: 'External'>
- NULL = <ConnectionPortalType.NULL: 'Null'>
- IMPORT_LITE = <ConnectionPortalType.IMPORT_LITE: 'ImportLite'>
- HTTP = <ConnectionPortalType.HTTP: 'Http'>
- CATALOG = <ConnectionPortalType.CATALOG: 'Namara'>
- class tdw_catalog.utils.CurrencyFieldValue(value: float, currency: str)[source]
Bases: object
CurrencyFieldValue models the value of a currency field
Attributes
- value: float
The currency value
- currency: str
The specific currency to which the value belongs
- class tdw_catalog.utils.Filter(limit: int = None, offset: int = None)[source]
Bases: LegacyFilter
Filter describes the ways in which results should be filtered and/or paginated. It is serialized differently from LegacyFilter.
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
- class tdw_catalog.utils.FilterSort(field: str, order: FilterSortOrder = FilterSortOrder.ASC)[source]
Bases: object
FilterSort describes a desired sort field and order for results.
Attributes
- field: str
The field to sort by
- order: FilterSortOrder, optional
The order to sort in (FilterSortOrder.ASC by default)
- enum tdw_catalog.utils.FilterSortOrder(value)[source]
Bases: Enum
Valid values are as follows:
- ASC = <FilterSortOrder.ASC: 1>
- DESC = <FilterSortOrder.DESC: 2>
- enum tdw_catalog.utils.ImportState(value)[source]
Bases: StrEnum
The different possible states an imported dataset might occupy. Virtualized datasets will always show state IMPORTED.
- Member Type: str
Valid values are as follows:
- IMPORTED = <ImportState.IMPORTED: 'imported'>
- IMPORTING = <ImportState.IMPORTING: 'importing'>
- QUEUED = <ImportState.QUEUED: 'queued'>
- FAILED = <ImportState.FAILED: 'failed'>
- class tdw_catalog.utils.LegacyFilter(limit: int = None, offset: int = None)[source]
Bases: object
LegacyFilter describes the ways in which results should be filtered and/or paginated
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
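The limit and offset attributes support a conventional offset-based pagination loop. The sketch below is one way to walk through every page; it assumes that a page shorter than the requested limit marks the end of the results, which this reference does not state explicitly. The helper iter_pages and its fetch_page callback are hypothetical and not part of the SDK.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int, int], List[T]], page_size: int) -> Iterator[T]:
    """Yield every result by requesting successive (limit, offset) pages.

    Hypothetical helper (not part of tdw_catalog). `fetch_page` receives
    `limit` and `offset` and returns one page of results. Assumption: a
    page with fewer than `page_size` items is the last one.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            return
        offset += page_size
```

With the SDK, the callback might look like lambda limit, offset: organization.list_teams(filter=LegacyFilter(limit=limit, offset=offset)), where organization is an Organization.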
- class tdw_catalog.utils.ListConnectionsFilter(limit: int = None, offset: int = None, organization_id: str | None = None, source_ids: List[str] | None = None, portals: List[ConnectionPortalType] | None = None)[source]
Bases: LegacyFilter
ListConnectionsFilter filters results according to Connection fields
Attributes
- organization_id: Optional[str]
Filters results by organization_id
- source_ids: Optional[List[str]]
Filters results to the given source_id(s)
- portals: Optional[List[ConnectionPortalType]]
Filters results to the given ConnectionPortalType(s)
- class tdw_catalog.utils.ListGlossaryTermsFilter(limit: int = None, offset: int = None, glossary_term_ids: List[str] | None = None)[source]
Bases: Filter
ListGlossaryTermsFilter filters results according to GlossaryTerm ids
Attributes
- glossary_term_ids: Optional[List[str]]
Filters results to the given glossary_term_id(s)
- class tdw_catalog.utils.ListOrganizationsFilter(limit: int = None, offset: int = None, organization_ids: List[str] | None = None)[source]
Bases: LegacyFilter
ListOrganizationsFilter filters Organization results according to a set of provided ids
Attributes
- organization_ids: List[str], optional
Filters results according to a set of provided ids
- class tdw_catalog.utils.ListSourcesFilter(limit: int = None, offset: int = None, labels: str | None = None)[source]
Bases: LegacyFilter
ListSourcesFilter filters results according to Source fields
Attributes
- labels: Optional[str]
Filters results by label. This will match label substrings.
- enum tdw_catalog.utils.MetadataFieldType(value)[source]
Bases: IntEnum
The different possible data types for values stored in MetadataFields and default values stored in MetadataTemplateFields
- Member Type: int
Valid values are as follows:
- FT_STRING = <MetadataFieldType.FT_STRING: 0>
- FT_INTEGER = <MetadataFieldType.FT_INTEGER: 1>
- FT_DECIMAL = <MetadataFieldType.FT_DECIMAL: 2>
- FT_DATE = <MetadataFieldType.FT_DATE: 3>
- FT_DATETIME = <MetadataFieldType.FT_DATETIME: 4>
- FT_DATASET = <MetadataFieldType.FT_DATASET: 5>
- FT_URL = <MetadataFieldType.FT_URL: 6>
- FT_USER = <MetadataFieldType.FT_USER: 7>
- FT_ATTACHMENT = <MetadataFieldType.FT_ATTACHMENT: 8>
- FT_LIST = <MetadataFieldType.FT_LIST: 9>
- FT_CURRENCY = <MetadataFieldType.FT_CURRENCY: 10>
- FT_TEAM = <MetadataFieldType.FT_TEAM: 11>
- FT_ALIAS = <MetadataFieldType.FT_ALIAS: 12>
- class tdw_catalog.utils.QueryFilter(limit: int = None, offset: int = None, sort: FilterSort = None, query: str | None = None)[source]
Bases: SortableFilter
QueryFilter filters results according to a NiQL query
Attributes
- query: str, optional
Filters results according to a NiQL query
- class tdw_catalog.utils.SortableFilter(limit: int = None, offset: int = None, sort: FilterSort = None)[source]
Bases: LegacyFilter
SortableFilter describes the ways in which results should be filtered, paginated and/or sorted.
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
- sort: FilterSort, optional
Specifies a desired sort field and order for results (None by default).