tdw_catalog
- class tdw_catalog.Catalog(*args, **kwargs)[source]
A Catalog is the primary client object for a ThinkData Catalog.
Parameters
- api_key: Optional[str]
Your personal API key for the Catalog platform. It is required, and may be omitted only when it is supplied via the CATALOG_API_KEY environment variable instead.
- auth_url: Optional[str]
An optional auth URL for the Catalog platform. It needs to be supplied only when connecting to a dedicated Catalog deployment, and can be populated via the CATALOG_AUTH_URL environment variable instead. Defaults to the auth URL for the ThinkData Works SaaS Catalog platform (https://account.ee.namara.io).
- api_url: Optional[str]
An optional API URL for the Catalog platform. It needs to be supplied only when connecting to a dedicated Catalog deployment, and can be populated via the CATALOG_API_URL environment variable instead. Defaults to the API URL for the ThinkData Works SaaS Catalog platform (https://api.ee.namara.io).
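The way these three parameters fall back to environment variables and defaults can be sketched as follows. `resolve_catalog_settings` is a hypothetical helper written for illustration, not part of `tdw_catalog`; only the environment variable names and default URLs come from the parameter descriptions, and the rule that an explicit argument wins over the environment is an assumption.

```python
import os
from typing import Mapping, Optional

# Defaults quoted from the parameter descriptions.
DEFAULT_AUTH_URL = "https://account.ee.namara.io"
DEFAULT_API_URL = "https://api.ee.namara.io"


def resolve_catalog_settings(
    api_key: Optional[str] = None,
    auth_url: Optional[str] = None,
    api_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: take each setting from the argument, then the environment."""
    env = os.environ if env is None else env
    key = api_key or env.get("CATALOG_API_KEY")
    if not key:
        # The API key has no default, so it must come from one of the two places.
        raise ValueError("api_key was not passed and CATALOG_API_KEY is not set")
    return {
        "api_key": key,
        "auth_url": auth_url or env.get("CATALOG_AUTH_URL") or DEFAULT_AUTH_URL,
        "api_url": api_url or env.get("CATALOG_API_URL") or DEFAULT_API_URL,
    }
```

The resulting dictionary has the same keys as the constructor parameters, so it could be unpacked into `Catalog(**settings)` if that is convenient.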
- create_organization(title: str) organization.Organization[source]
Creates an Organization.
Parameters
- title: str
The title for the new Organization
Returns
- Organization
The created Organization
Raises
- CatalogException
If there is an issue communicating with the Catalog server, or an issue with the server itself
- get_organization(id: str) organization.Organization[source]
Retrieve a specific Organization.
Parameters
- id: str
The UUID of the Organization
Returns
- Organization
The Organization which has the provided id, if it exists and the caller is a member of it
Raises
- CatalogPermissionDeniedException
If the caller is not a member of the given Organization, or if it does not exist
- CatalogException
If there is an issue communicating with the Catalog server, or an issue with the server itself
- list_organizations(filter: ListOrganizationsFilter | None = None) List[organization.Organization][source]
Retrieve the list of Organizations to which the caller belongs.
Parameters
- filter: ListOrganizationsFilter
An optional filter on the returned Organizations (None by default).
Returns
- list[Organization]
The list of Organizations to which the caller belongs, ordered by title (ascending).
Raises
- CatalogException
If there is an issue communicating with the Catalog server, or an issue with the server itself
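The three organization methods can be combined into a simple "get or create" lookup. This is a minimal sketch: `get_or_create_organization` is a hypothetical helper, and it assumes that each returned Organization exposes a `title` attribute, which the reference above implies but does not state.

```python
from typing import Any


def get_or_create_organization(catalog: Any, title: str) -> Any:
    """Return the first organization with a matching title, creating one if none exists."""
    for org in catalog.list_organizations():
        # Assumption: Organization objects expose a `title` attribute.
        if getattr(org, "title", None) == title:
            return org
    # No match among the caller's organizations, so create a new one.
    return catalog.create_organization(title)
```

Because `list_organizations` only returns organizations the caller belongs to, this lookup cannot see organizations owned by others that happen to share the title.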
connection
- class tdw_catalog.connection.ConnectionSchedule(interval: HourlyInterval | DailyInterval | WeeklyInterval | MonthlyInterval | YearlyInterval, timezone: str)[source]
Bases: object
A ConnectionSchedule describes the frequency with which to reingest ingested data, or re-analyze virtualized data.
Attributes
- interval: HourlyInterval | DailyInterval | WeeklyInterval | MonthlyInterval | YearlyInterval
The interval that this schedule represents
- timezone: str
The timezone in which to interpret times in the interval
- class tdw_catalog.connection.DailyInterval(minute: int, hour: int)[source]
Bases: HourlyInterval
A DailyInterval causes a ConnectionSchedule to execute at a specific minute and hour each day.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- class tdw_catalog.connection.HourlyInterval(minute: int)[source]
Bases: object
An HourlyInterval causes a ConnectionSchedule to execute at a specific minute every hour.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- class tdw_catalog.connection.IngestionConnection(client, **kwargs)[source]
Bases: _Connection
IngestionConnections are used to attach ingested data to a Dataset, describing the mechanism and necessary credentials for accessing said data. Data is ingested via an IngestionConnection: pulled from an uploaded file, or from a remote location such as a cloud storage bucket.
Attributes
- id: str
The IngestionConnection's unique ID
- source_id: str
The unique ID of the Source to which this IngestionConnection belongs
- source: Source
The Source associated with this IngestionConnection. A Source or source_id can be provided, but not both.
- user_id: str
The unique User ID of the user who created this IngestionConnection
- label: str
The descriptive label for this IngestionConnection
- description: Optional[str] = None
An optional extended description for this IngestionConnection
- portal: ConnectionPortalType
The method of data access employed by this IngestionConnection
- url: str
A canonical URL that points to the location of data resources within the portal
- warehouse: Optional[str]
Datasets created using this IngestionConnection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential_id: Optional[str]
The Credential ID that should be used along with the portal to access Datasets when ingesting.
- credential: Optional[credential.Credential]
The Credential associated with this IngestionConnection. Omitted when virtualizing. A Credential or credential_id can be provided, but not both.
- ingest_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to reingest ingested data. Specific Datasets using this IngestionConnection may override this set of ConnectionSchedules.
- disabled: Optional[bool]
When true, disables the schedule on this IngestionConnection. The IngestionConnection itself can still be used for manual ingestion or data virtualization.
- created_at: datetime
The datetime at which this IngestionConnection was created
- updated_at: datetime
The datetime at which this IngestionConnection was last updated
- class tdw_catalog.connection.MonthlyInterval(minute: int, hour: int, dayOfMonth: int)[source]
Bases: DailyInterval
A MonthlyInterval causes a ConnectionSchedule to execute on a specific day of the month, at a specific minute and hour, every month.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfMonth: int
The day of the month to execute at, between 1 and 31, or -1 for the last day of each month
- class tdw_catalog.connection.VirtualizationConnection(client, **kwargs)[source]
Bases: _Connection
VirtualizationConnections are used to attach virtualized data to a Dataset, describing the mechanism and necessary credentials for accessing said data. Data is accessed from a remote location without being copied into the platform.
Attributes
- id: str
The VirtualizationConnection's unique ID
- source_id: str
The unique ID of the Source to which this VirtualizationConnection belongs
- source: Source
The Source associated with this VirtualizationConnection. A Source or source_id can be provided, but not both.
- user_id: str
The unique User ID of the user who created this VirtualizationConnection
- label: str
The descriptive label for this VirtualizationConnection
- description: Optional[str] = None
An optional extended description for this VirtualizationConnection
- portal: ConnectionPortalType
The method of data access employed by this VirtualizationConnection
- url: str
A canonical URL that points to the location of data resources within the portal
- warehouse: Optional[str]
Virtualized datasets created using this connection will always access data from this Warehouse (must be supplied for virtualization). Non-virtualized datasets created using this connection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential_id: Optional[str]
The Credential ID that should be used along with the portal to access Datasets when ingesting. Omitted when virtualizing.
- credential: Optional[credential.Credential]
The Credential associated with this VirtualizationConnection. Omitted when virtualizing. A Credential or credential_id can be provided, but not both.
- default_schema: str
The schema to search for tables and views
- metrics_collection_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to re-analyze virtualized data. Specific Datasets using this VirtualizationConnection may override this set of ConnectionSchedules.
- disabled: Optional[bool]
When true, disables the schedule on this VirtualizationConnection. The connection itself can still be used for manual ingestion or data virtualization.
- created_at: datetime
The datetime at which this VirtualizationConnection was created
- updated_at: datetime
The datetime at which this VirtualizationConnection was last updated
- class tdw_catalog.connection.WeeklyInterval(minute: int, hour: int, dayOfWeek: int)[source]
Bases: DailyInterval
A WeeklyInterval causes a ConnectionSchedule to execute on a specific day of the week, at a specific minute and hour, every week.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfWeek: int
The day of the week to execute at, between 0 and 6, where 0 is Sunday
- class tdw_catalog.connection.YearlyInterval(minute: int, hour: int, dayOfMonth: int, month: int)[source]
Bases: MonthlyInterval
A YearlyInterval causes a ConnectionSchedule to execute on a specific day of a specific month, at a specific minute and hour, every year.
Attributes
- minute: int
The minute of the hour to execute at, between 0 and 59
- hour: int
The hour of the day to execute at, between 0 and 23
- dayOfMonth: int
The day of the month to execute at, between 1 and 31, or -1 for the last day of each month
- month: int
The month of the year to execute at, between 1 and 12
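The numeric ranges listed for the interval classes can be checked before a schedule is built. The sketch below is a hypothetical, standalone validator; it does not use or reproduce any validation that `tdw_catalog` may perform itself, and the snake_case argument names are chosen for illustration only.

```python
from typing import Optional


def validate_interval_fields(
    minute: int,
    hour: Optional[int] = None,
    day_of_week: Optional[int] = None,
    day_of_month: Optional[int] = None,
    month: Optional[int] = None,
) -> None:
    """Raise ValueError if a supplied field is outside its documented range."""

    def check(name: str, value: Optional[int], low: int, high: int) -> None:
        # Fields that were not supplied (None) are skipped.
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    check("minute", minute, 0, 59)
    check("hour", hour, 0, 23)
    check("dayOfWeek", day_of_week, 0, 6)  # 0 is Sunday
    check("month", month, 1, 12)
    # dayOfMonth additionally accepts -1, meaning the last day of the month.
    if day_of_month != -1:
        check("dayOfMonth", day_of_month, 1, 31)
```

For example, `validate_interval_fields(minute=30, hour=2, day_of_month=-1)` passes silently, while `validate_interval_fields(minute=60)` raises `ValueError`.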
credential
- class tdw_catalog.credential.CatalogCredential(client, **kwargs)[source]
Bases: Credential
A CatalogCredential permits a Source to access datasets which exist on another ThinkData Works Catalog server.
Attributes
- catalog_api_keystr
The API key for the target
Catalog. Can be updated, but not read.
- class tdw_catalog.credential.Credential(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
Credentials are used in conjunction with Sources to ingest data into Datasets.
Parameters
- id: str
The Credential's unique ID
- organization_id: str
The unique ID of the Organization to which this Credential belongs
- user_id: str
The unique user ID of the user who created this Credential
- name: str
A name for this Credential
- description: str
The optional description of this Credential
- created_at: datetime
The datetime at which this Credential was created
- updated_at: datetime
The datetime at which this Credential was last updated
- delete() None[source]
Delete this Credential. This Credential object should not be used after delete() returns successfully.
Parameters
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this Credential
- CatalogInvalidArgumentException
If the given Credential does not exist
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve a Credential belonging to an Organization.
Parameters
- client: Catalog
The Catalog client to use to get the Credential
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the Credential
Returns
- Credential
The Credential associated with the given ID
- save() None[source]
Update this Credential, saving any changes to its name, description or type-specific fields.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this Credential
- CatalogException
If the call to the Catalog server fails
- class tdw_catalog.credential.CredentialFactory(client: Catalog, organization_id: str)[source]
Bases: object
A CredentialFactory creates specific types of Credentials within a specific Organization.
- catalog_credential(name: str, description: str | None, catalog_api_key: str) CatalogCredential[source]
Constructs a CatalogCredential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- catalog_api_key: str
The API key for the target Catalog
Returns
- CatalogCredential
The created CatalogCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
- ftp_credential(name: str, description: str | None, username: str, password: str) FTPCredential[source]
Constructs an FTPCredential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target FTP server
- password: str
The password for the target FTP server
Returns
- FTPCredential
The created FTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
- google_storage_credential(name: str, description: str | None, region: str, project: str, client_secrets: str) GoogleStorageCredential[source]
Constructs a GoogleStorageCredential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- project: str
The name of the Google Cloud project in which the bucket can be found
- region: str
The Google Cloud region in which the bucket can be found (e.g. us-central1)
- client_secrets: str
The client secrets for the Google Storage bucket. Can be updated, but not read.
Returns
- GoogleStorageCredential
The created GoogleStorageCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
- s3_credential(name: str, description: str | None, region: str, access_key_id: str, secret_access_key: str) S3Credential[source]
Constructs an S3Credential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- region: str
The AWS S3 region in which the bucket resides
- access_key_id: str
The AWS Access Key for the S3 bucket. Can be updated, but not read.
- secret_access_key: str
The AWS Secret Access Key for the S3 bucket. Can be updated, but not read.
Returns
- S3Credential
The created S3Credential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
- sftp_with_key_credential(name: str, description: str | None, username: str, ssh_key: str) SFTPCredential[source]
Constructs a key-based SFTPCredential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target SFTP server
- ssh_key: str
The SSH key for the target SFTP server
Returns
- SFTPCredential
The created SFTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
- sftp_with_password_credential(name: str, description: str | None, username: str, password: str) SFTPCredential[source]
Constructs a password-based SFTPCredential.
Parameters
- name: str
A name for this Credential
- description: Optional[str]
The optional description of this Credential
- username: str
The username for the target SFTP server
- password: str
The password for the target SFTP server
Returns
- SFTPCredential
The created SFTPCredential
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create Credentials
- CatalogInvalidArgumentException
If one or more of the given credential parameters are invalid
- CatalogException
If the call to the Catalog server fails
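Because SFTP credentials have two separate factory methods, calling code often needs to choose between them. The following is a minimal sketch of that choice: `make_sftp_credential` is a hypothetical wrapper, it calls only the two factory methods documented above, and its rule that exactly one of `password` or `ssh_key` may be given is an assumption made for the example.

```python
from typing import Any, Optional


def make_sftp_credential(
    factory: Any,
    name: str,
    username: str,
    password: Optional[str] = None,
    ssh_key: Optional[str] = None,
    description: Optional[str] = None,
) -> Any:
    """Hypothetical helper: pick the key-based or password-based factory method."""
    # Assumption for this sketch: reject both "neither" and "both".
    if (password is None) == (ssh_key is None):
        raise ValueError("supply exactly one of password or ssh_key")
    if ssh_key is not None:
        return factory.sftp_with_key_credential(
            name=name, description=description, username=username, ssh_key=ssh_key
        )
    return factory.sftp_with_password_credential(
        name=name, description=description, username=username, password=password
    )
```

Here `factory` is expected to be a `CredentialFactory` for the target Organization; any object with the same two methods works, which keeps the helper easy to test.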
- class tdw_catalog.credential.FTPCredential(client, **kwargs)[source]
Bases: Credential
An FTPCredential permits a Source to access data stored on an FTP server.
Attributes
- username: str
The username for the target FTP server
- password: str
The password for the target FTP server. Can be updated, but not read.
- class tdw_catalog.credential.GoogleStorageCredential(client, **kwargs)[source]
Bases: Credential
A GoogleStorageCredential permits a Source to access data stored in a Google Storage (GS) bucket.
Attributes
- project: str
The name of the Google Cloud project in which the bucket can be found
- region: str
The Google Cloud region in which the bucket can be found (e.g. us-central1)
- client_secrets: str
The client secrets for the Google Storage bucket. Can be updated, but not read.
- class tdw_catalog.credential.S3Credential(client, **kwargs)[source]
Bases: Credential
An S3Credential permits a Source to access data stored in an AWS S3 (or other S3-compatible) bucket.
Attributes
- region: str
The AWS S3 region in which the bucket resides
- access_key_id: str
The AWS Access Key for the S3 bucket. Can be updated, but not read.
- secret_access_key: str
The AWS Secret Access Key for the S3 bucket. Can be updated, but not read.
- class tdw_catalog.credential.SFTPCredential(client, **kwargs)[source]
Bases: Credential
An SFTPCredential permits a Source to access data stored on an SFTP server.
Attributes
- username: str
The username for the target SFTP server
- password: str
The password for the target SFTP server. Can be updated, but not read.
- ssh_key: Optional[str]
The SSH key for the target SFTP server. Either ssh_key or password must be set. Can be updated, but not read.
dataset
- class tdw_catalog.dataset.ConnectedDataset(client, **kwargs)[source]
Bases: Dataset
A ConnectedDataset is identical to a Dataset and inherits all of its fields, but represents a Dataset which is connected to the actual underlying data asset via a Connection. A ConnectedDataset supports queries, export, health monitoring, etc.
Attributes
- exports_disabled: bool
A flag to mark if this Dataset may be exported. Setting this to false does not prevent querying on this Dataset. Only relevant if the Dataset is connected to data.
- warehouse: str
The underlying data warehouse where the data resides
- metrics_last_collected_at: datetime
The last time metrics were collected for this Dataset (virtualized Datasets) or the last time the Dataset was imported (ingested Datasets).
- next_scheduled_metrics_collection_time: Optional[datetime]
If this Dataset has an associated connection schedule, the next time this Dataset will collect metrics (virtualized Datasets) or import (ingested Datasets).
- last_metrics_collection_failure_time: datetime
The most recent time metrics collection (virtualized Datasets) or import (ingested Datasets) failed. None if metrics collection has never failed.
- warehouse_metadata: Optional[List[metadata_field.MetadataField]]
Harvested metadata from virtualized Datasets. None for ingested Datasets.
- property advanced_configuration: str
This configuration string is auto-generated during ingestion or virtualization, inferred from the connected data. It can be modified, with caution, to alter how the Catalog perceives and represents the connected data.
Modification of this configuration without support from ThinkData Works is not recommended.
- connect() DatasetConnector[source]
Manage all connection-related aspects of this ConnectedDataset.
There are many methods for connecting a Dataset, thus a helper object is returned with various method-based workflows that aid in connecting to data.
Returns
- DatasetConnector
A helper object for configuring this Dataset's connection to data.
- property connection: IngestionConnection | VirtualizationConnection
The underlying IngestionConnection or VirtualizationConnection which links this Dataset to data
- property connection_id: str
The ID of the underlying IngestionConnection or VirtualizationConnection which links this Dataset to data
- async export_csv(query: str | None = None) CSVExport[source]
Async function which returns the URL which can be used to stream a CSV-formatted copy of the connected data, optionally filtered by the supplied SQL-like NiQL query. Note that most standard SQL keywords are supported, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) are not.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Unlike ConnectedDataset.query(), there is no limit on exported rows, other than any imposed by the underlying warehouse.
Parameters
- query: Optional[str]
A NiQL query used to filter or reshape the data before exporting
Returns
- CSVExport
A CSVExport object containing a signed download URL which can be used to fetch the exported data. It can be downloaded in its entirety, or streamed in chunks. This CSVExport object improves the usability of the CSV data when employing pandas, including a configuration for read_csv which can be passed via **export as follows: df = pd.read_csv(export.url, **export), ensuring that the resultant DataFrame has the correct schema for all fields (including dates). Note: It is recommended that export_parquet be employed for use with pandas when supported by the underlying warehouse.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to export data
- CatalogInvalidArgumentException
If the given query is invalid
- CatalogException
If the call to the Catalog server fails, or the export process itself fails
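The `{this}` placeholder is plain text inside the query string, so queries can be assembled with ordinary string operations. The sketch below is illustrative only: `build_niql` is a hypothetical helper, and it assumes that NiQL accepts `WHERE` and `LIMIT` clauses in the usual SQL positions, which the reference suggests ("most standard SQL keywords are supported") but does not spell out.

```python
from typing import Optional, Sequence


def build_niql(
    columns: Sequence[str] = ("*",),
    where: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Assemble a read-only query string that targets the current dataset."""
    # "{this}" is kept literally; this is a normal string, not an f-string.
    query = "SELECT " + ", ".join(columns) + " FROM {this}"
    if where:
        # The condition is inserted as-is; the caller is responsible for quoting.
        query += " WHERE " + where
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query
```

The returned string can be passed as the `query` argument of `export_csv`, `export_parquet` or `query`. Avoid running `str.format` on it afterwards, since that would try to substitute `{this}`.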
- async export_parquet(query: str | None = None) ParquetExport[source]
Async function which returns the URL which can be used to stream a Parquet-formatted copy of the connected data, optionally filtered by the supplied SQL-like NiQL query. Note that most standard SQL keywords are supported, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) are not.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Unlike ConnectedDataset.query(), there is no limit on exported rows, other than any imposed by the underlying warehouse.
Note: Parquet export is not (yet) supported for all underlying warehouse types, but this export method should be preferred when interfacing with pandas whenever possible.
Parameters
- query: Optional[str]
A NiQL query used to filter or reshape the data before exporting
Returns
- ParquetExport
A ParquetExport object containing a signed download URL which can be used to fetch the exported data. It can be downloaded in its entirety, or streamed in chunks. This ParquetExport object can be directly employed by pandas as follows: df = pd.read_parquet(export.url). Note that pandas requires pyarrow or fastparquet in order to read_parquet. Note: It is recommended that export_parquet be employed for use with pandas when supported by the underlying warehouse.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to export data
- CatalogInvalidArgumentException
If the given query is invalid, or if Parquet export is not available for this warehouse type
- CatalogException
If the call to the Catalog server fails, or the export process itself fails
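Since Parquet export is preferred but not available for every warehouse type, one possible pattern is to try it first and fall back to CSV. This is a sketch under stated assumptions: `export_preferring_parquet` is a hypothetical helper, and the exception class defined here is a local stand-in that should be replaced with the library's own `CatalogInvalidArgumentException` in real code.

```python
from typing import Any, Optional


class CatalogInvalidArgumentException(Exception):
    """Local stand-in for the library exception of the same name."""


async def export_preferring_parquet(dataset: Any, query: Optional[str] = None) -> Any:
    """Try export_parquet first and fall back to export_csv if it is rejected."""
    try:
        return await dataset.export_parquet(query)
    except CatalogInvalidArgumentException:
        # This exception is also documented for invalid queries; in that case
        # the CSV attempt below is expected to fail with the same error.
        return await dataset.export_csv(query)
```

The caller receives either a `ParquetExport` or a `CSVExport`, so downstream code needs to check which one it got before choosing between `pd.read_parquet` and `pd.read_csv`.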
- property health_monitoring_enabled: bool
Whether or not Catalog platform health monitoring is enabled for this ConnectedDataset
- property metrics_collection_schedules: List[ConnectionSchedule] | None
Returns all configured schedules for metrics collection, which govern health monitoring intervals and ingestion intervals for ingested Datasets
- async query(query: str | None = None) QueryCursor[source]
Async function which returns a Python DB API-style Cursor object (PEP 249), representing the results of the supplied SQL-like NiQL query executed against the connected data.
Note that NiQL supports most standard SQL keywords, but keywords which modify underlying data (e.g. INSERT, UPDATE, DELETE) may not be used.
Note that the Catalog platform enforces a global limit on results (10,000 rows) from a single query.
To refer to the current dataset in the query, include {this} in the query, such as: "SELECT * FROM {this}".
Parameters
- query: Optional[str]
A NiQL query used to filter or reshape the data before exporting
Returns
- QueryCursor
The query results cursor, which can be printed, converted to a pandas DataFrame via pd.DataFrame(res.fetchall()), etc.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to query data
- CatalogInvalidArgumentException
If the given query is invalid
- CatalogException
If the call to the Catalog server fails, or the export process itself fails
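Because the result is described as a PEP 249-style cursor, generic DB API code should apply to it. The sketch below turns fetched rows into dictionaries keyed by column name. It assumes the `QueryCursor` exposes the standard `description` attribute in addition to `fetchall()`, which is not confirmed above, and it uses the standard library's `sqlite3` cursor as a stand-in for the demonstration.

```python
import sqlite3
from typing import Any, Dict, List


def rows_as_dicts(cursor: Any) -> List[Dict[str, Any]]:
    """Pair each fetched row with the column names from cursor.description."""
    # PEP 249: description is a sequence of 7-item sequences; item 0 is the name.
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Demonstration with sqlite3, whose cursors follow PEP 249.
conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 1 AS id, 'a' AS label")
records = rows_as_dicts(cur)
conn.close()
```

A list of dictionaries like `records` can be handed to `pd.DataFrame(records)` to obtain named columns.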
- async reconnect()[source]
Manually triggers a reimport of ingested data for ingested datasets, and metrics collection (health monitoring, etc.) for virtualized and ingested datasets.
Useful for forcing a metrics collection, or applying changes made to the advanced_configuration.
- refresh() ConnectedDataset[source]
Return a fresh copy of this ConnectedDataset, with up-to-date property values. Useful after performing an update, connection, etc.
- property set_advanced_configuration: str
This configuration string is auto-generated during ingestion or virtualization, inferred from the connected data. It can be modified, with caution, to alter how the Catalog perceives and represents the connected data.
Modification of this configuration without support from ThinkData Works is not recommended.
- property set_health_monitoring_enabled: bool
Whether or not Catalog platform health monitoring is enabled for this ConnectedDataset
- property set_metrics_collection_schedules: List[ConnectionSchedule] | None
Returns all configured schedules for metrics collection, which govern health monitoring intervals and ingestion intervals for ingested Datasets
- class tdw_catalog.dataset.Dataset(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation, _SourceRelation
A Dataset represents a cataloged data asset within an Organization. It is a container for structured and custom metadata describing the asset, and can optionally be connected to the data asset via an IngestionConnection or VirtualizationConnection to support queries, health monitoring, etc.
Attributes
- id: str
The Dataset's unique ID
- title: str
The title of the Dataset
- description: Optional[str]
The full description text (supports Markdown) that helps describe this Dataset
- uploader_id: str
- source_id: str
- source: str
- organization_id: str
The unique ID of the Organization which this Dataset belongs to
- organization: Organization
The Organization which this Dataset belongs to
- metadata_template: MetadataTemplate
The MetadataTemplate attached to this Dataset, if any
- data_dictionary: DataDictionary
The DataDictionary defined within this Dataset, or describing the schema of the connected data if this is a ConnectedDataset
- created_at: datetime
The date this Dataset was originally created
- updated_at: datetime
The date this Dataset's metadata was last modified
- attach_template(template: MetadataTemplate)[source]
Attach a MetadataTemplate to this Dataset. Values may be supplied to templated fields immediately, but the template will only be attached when Dataset.save() is called.
Parameters
- template: MetadataTemplate
The MetadataTemplate to be attached to the Dataset
Returns
- Dataset
The Dataset with a newly attached MetadataTemplate
- classify(topic: Topic) None[source]
Classify this Dataset with a Topic, linking them semantically.
Parameters
Returns
None
Raises
- connect() DatasetConnector[source]
Converts a Dataset into a ConnectedDataset, by accessing data via an IngestionConnection or VirtualizationConnection. A ConnectedDataset can represent ingested data, which is copied into the Catalog platform, or virtualized data, which is accessed remotely by the platform without being copied.
There are many methods for connecting a Dataset, thus a helper object is returned with various method-based workflows that aid in connecting to data.
Returns
- DatasetConnector
A helper object for configuring this Dataset's connection to data.
- property custom_metadata: List[MetadataField]
A list of MetadataFields attached to this Dataset that are not associated with an attached MetadataTemplate
- declassify(topic: Topic) None[source]
Remove a Topic classification from this Dataset.
Parameters
Returns
None
Raises
- delete() None[source]
Delete this Dataset. The Dataset object should not be used after this method is invoked successfully.
Raises
- detach_template()[source]
Remove the attached MetadataTemplate from this Dataset. Any fields from this MetadataTemplate will remain on the Dataset, but as individual MetadataFields. Detachment happens instantly, and calling Dataset.save() is not necessary for the changes to persist.
Parameters
None
Returns
- Dataset
The Dataset with no attached MetadataTemplate
- classmethod get(client: Catalog, id: str, context_organization: organization.Organization | None = None)[source]
Retrieve a Dataset.
Parameters
- client: Catalog
- id: str
The unique ID of the Dataset
- context_organization: Optional[Organization]
The Organization from which this Dataset is being retrieved. Datasets may be accessible from multiple Organizations, but can have differing metadata within each. This context parameter is necessary to determine which metadata to load.
Returns
- Dataset
The Dataset associated with the given ID
Raises
- list_topics(organization_id: str | None = None, filter: Filter | None = None) List[Topic][source]
Retrieves the list of all Topics this Dataset is currently classified under, within the given Organization.
Parameters
- organization_id: Optional[str]
An optional ID for an Organization other than the original Organization the Dataset was created in (e.g. if the Dataset has been shared to another organization with a different set of Topics)
- filter: Optional[Filter]
An optional tdw_catalog.utils.Filter to offset or limit the list of Topics returned
Returns
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Topics in this Organization
- CatalogException
If the call to the Catalog server fails
- refresh() Dataset[source]
Return a fresh copy of this Dataset, with up-to-date property values. Useful after performing an update, connection, etc.
- property templated_metadata: List[MetadataField]
A list of MetadataFields attached to this Dataset that are associated with an attached MetadataTemplate
- update_custom_metadata() MetadataEditor[source]
Provides a MetadataEditor which allows for the addition, removal, and alteration of MetadataFields on this Dataset that are not associated with an attached MetadataTemplate.
Parameters
None
Returns
- MetadataEditor
An editor for adding, removing, and updating MetadataFields on the Dataset which do not belong to a MetadataTemplate
- update_templated_metadata() TemplatedMetadataEditor[source]
Provides a TemplatedMetadataEditor which allows for the alteration of MetadataFields on this Dataset that are associated with an attached MetadataTemplate. This object cannot add or remove MetadataFields; that must be done on the MetadataTemplate directly.
Parameters
None
Returns
- TemplatedMetadataEditor
An editor for updating MetadataFields on the Dataset that are associated with an attached MetadataTemplate
dataset_connector
- class tdw_catalog.dataset_connector.DatasetConnector(d: Dataset | ConnectedDataset)[source]
Bases: object
A helper object for configuring a Dataset's connection to data. Can either connect a Dataset for the first time, or reconnect an already-connected Dataset to different data.
- async ingest_from_file(local_file_path: str, connection: IngestionConnection | None = None, target_warehouse: TargetWarehouse | None = None) ConnectedDataset[source]
Async function which uploads a local file to the Catalog platform and ingests it, connecting this Dataset to that ingested data.
Parameters
- local_file_path: str
The path to the file on disk. The file will be streamed from disk, rather than read into memory, to ensure large files upload successfully.
- connection: Optional[IngestionConnection]
Optionally specify a file upload-type IngestionConnection for use. This IngestionConnection must reside within the existing Dataset's Source, and must be of the correct type (ConnectionPortalType.IMPORT_LITE). If not provided, the first available file upload Connection within the Dataset's Source will be used, or one will be created if none are available.
- target_warehouse: Optional[TargetWarehouse]
Optionally specify a target warehouse to ingest to. If omitted, the TargetWarehouse specified by the IngestionConnection will be used, or the default TargetWarehouse for the Organization if the IngestionConnection does not specify a default TargetWarehouse.
Returns
- ConnectedDataset
The newly connected
Dataset, if it was not connected previously, or an updated version of the existingConnectedDatasetif it was connected previously. FurtherDatasetoperations should be performed on this returned object.
Raises
- FileNotFoundError
If the specified local_file_path does not exist
- CatalogPermissionDeniedException
If the caller is not allowed to perform any of the steps involved in ingesting data from a file
- CatalogInvalidArgumentException
If the given IngestionConnection cannot be used
- CatalogException
If the call to the Catalog server fails, or the ingest process itself fails
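The call pattern for ingest_from_file can be sketched as follows. This is only a minimal sketch: `connector` is assumed to be any object exposing the async ingest_from_file method documented above, and the helper name `ingest_local_file` is hypothetical. The up-front existence check mirrors the documented FileNotFoundError; it is not something the library is known to require.

```python
import os


async def ingest_local_file(connector, local_file_path):
    """Upload and ingest a local file through a DatasetConnector-like object.

    `connector` is a stand-in: any object with an async
    ingest_from_file(local_file_path) method, as documented above.
    """
    # Fail early with a clear error; the documented method is also
    # described as raising FileNotFoundError for a missing path.
    if not os.path.isfile(local_file_path):
        raise FileNotFoundError(local_file_path)
    # Further Dataset operations should be performed on the returned object.
    return await connector.ingest_from_file(local_file_path)
```

From synchronous code, a coroutine like this would typically be driven with asyncio.run(ingest_local_file(connector, "data.csv")).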
data_dictionary
- class tdw_catalog.data_dictionary.Column(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None)[source]
Bases:
objectA single
Columnwithin aDataDictionaryAttributes
- keystr
The column name for this
Column, within the actualWarehousewhere the data lives- typeColumnType
The data type for this
Column. Available types can be found inColumnType.- name: Optional[str]
An optional friendly name for this
Column, which is visually used in place of thekeythroughout theCatalog- description: Optional[str]
An optional description for this
Column
- apply_glossary_term(glossary_term: glossary_term.GlossaryTerm) None[source]
Apply a
GlossaryTermto thisColumn. The containingDataDictionarymust be saved for the change to take permanent effect.Parameters
- glossary_termGlossaryTerm
The
GlossaryTermto classify thisColumnwith
Returns
None
Raises
- CatalogInvalidArgumentException
If the
Organizationof theGlossaryTermdoes not match theOrganizationwhich theDatasetwas retrieved from.
- list_glossary_terms() List[glossary_term.GlossaryTerm][source]
Return a list of
GlossaryTerms that have been applied to thisColumnParameters
None
Returns
- List[glossary_term.GlossaryTerm]
The list of
GlossaryTerms that have been applied to thisColumn
Raises
- CatalogPermissionDeniedException
If the caller does not have permission to list
GlossaryTerms on aDataset‘sColumns- CatalogInternalException
If call to the
Catalogserver fails
- remove_glossary_term(glossary_term: glossary_term.GlossaryTerm) None[source]
Remove a
GlossaryTermfrom thisColumn. The containingDataDictionarymust be saved for the change to take permanent effect.Parameters
- glossary_termGlossaryTerm
The
GlossaryTermto be removed from thisColumn
Returns
None
- class tdw_catalog.data_dictionary.CurrencyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None, symbol: str | None = None)[source]
Bases:
ColumnA currency-specific extension of
Column, with an added currency symbol (such as $)Attributes
- symbolOptional[str]
An optional currency symbol (e.g.
'$')
- class tdw_catalog.data_dictionary.DataDictionary(dataset: Dataset, last_updated_at: datetime, version_id: str | None, columns: List[Column])[source]
Bases: object
A DataDictionary describes the schema of data represented by a Dataset as a sequence of Columns, each with a key, name, type, and optional description.
A DataDictionary behaves as a dict: columns can be accessed via their key as follows: data_dictionary["column_name"].
Attributes
- last_updated_at: datetime
The last time this DataDictionary was updated, either by hand (for Datasets which are not connected) or via a scheduled metrics collection (for ConnectedDatasets which are)
- columns: List[Column]
The list of Columns which make up this DataDictionary
- columns() List[Column][source]
Returns all
Columns in thisDataDictionary
- has_key(key: str) bool[source]
Returns True if and only if a Column with the given key exists in this DataDictionary
- property last_updated_at: datetime
Returns the last time this
DataDictionarywas modified
- save()[source]
Update this
DataDictionary, saving all changes to its schemaRaises
- CatalogPermissionDeniedException
If the caller is not allowed to update this
DataDictionary- CatalogException
If call to the
Catalogserver fails
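The dict-style access and the has_key check described above could be combined as in this minimal sketch. `data_dictionary` is assumed to be a DataDictionary-like object supporting has_key(key) and item access by key, and its columns are assumed to expose the `key` and optional `name` attributes listed for Column. The helper name `friendly_name` is hypothetical.

```python
def friendly_name(data_dictionary, key):
    """Return the display name for a column, falling back to its key.

    Assumes a DataDictionary-like object (has_key plus item access by key)
    whose columns carry `key` and an optional `name`, as documented above.
    """
    if not data_dictionary.has_key(key):
        return None  # no such column in this dictionary
    column = data_dictionary[key]
    # `name` is optional; when set it is used in place of `key` for display.
    return column.name or column.key
```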
- class tdw_catalog.data_dictionary.MetadataOnlyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None)[source]
Bases:
ColumnIdentical to
Column, but within aMetadataOnlyDataDictionaryattached to aDatasetwhich is not connected to data. When not connected, all aspects of a data dictionary can be freely modified (includingkeyandtype), as there is no underlying data providing/constraining the dictionary.Attributes
- keystr
The column name for this
Column, within the actualWarehousewhere the data lives- typeColumnType
The data type for this
Column. Available types can be found inColumnType.- name: str
An optional friendly name for this
Column, which is visually used in place of thekeythroughout theCatalog- description: Optional[str]
An optional description for this
Column
- class tdw_catalog.data_dictionary.MetadataOnlyCurrencyColumn(key: str = None, type: ColumnType = None, name: str | None = None, description: str | None = None, symbol: str | None = None)[source]
Bases:
CurrencyColumn,MetadataOnlyColumnThe
MetadataOnlyColumnversion ofCurrencyColumnAttributes
- symbolOptional[str]
The currency symbol
- class tdw_catalog.data_dictionary.MetadataOnlyDataDictionary(dataset: Dataset, last_updated_at: datetime, version_id: str | None, columns: List[Column])[source]
Bases: DataDictionary
A MetadataOnlyDataDictionary is identical to a DataDictionary, but is attached to a Dataset which is not connected to data.
Because the Dataset is not connected, all aspects of the dictionary can be modified freely, including column keys, types, etc. (because they are not constrained by existing underlying data).
A MetadataOnlyDataDictionary behaves as a dict: columns can be accessed (and overwritten) via their key as follows: data_dictionary["column_name"] = ....
Attributes
- last_updated_at: datetime
The last time this DataDictionary was updated, either by hand (for Datasets which are not connected) or via a scheduled metrics collection (for ConnectedDatasets which are)
- columns: List[MetadataOnlyColumn]
The list of MetadataOnlyColumns which make up this DataDictionary
- add(col: Column, index: int | None = None) MetadataOnlyDataDictionary[source]
Appends a specific
Columnto thisMetadataOnlyDataDictionary, or inserts it at a specificindex.Parameters
Returns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
- clear() MetadataOnlyDataDictionary[source]
Removes all
Columns from thisMetadataOnlyDataDictionaryReturns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
- columns() List[MetadataOnlyColumn][source]
Returns all
Columns in thisMetadataOnlyDataDictionary
- remove(key: str) MetadataOnlyDataDictionary[source]
Removes a specific
Columnfrom thisMetadataOnlyDataDictionaryby keyParameters
- keystr
The key of the
Column
Returns
- MetadataOnlyDataDictionary
A reference to itself for method chaining
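Because add, clear, and remove each return the dictionary itself, schema edits can be chained. A minimal sketch of that pattern, assuming `dictionary` is a MetadataOnlyDataDictionary-like object with the clear(), add(col), and save() methods documented above; `replace_columns` is a hypothetical helper, not part of tdw_catalog.

```python
def replace_columns(dictionary, columns):
    """Swap out every column of an unconnected Dataset's dictionary.

    Relies only on the documented clear(), add(col) and save() methods;
    clear() and add() return the dictionary, which allows chaining.
    """
    chained = dictionary.clear()       # returns the dictionary itself
    for column in columns:
        chained = chained.add(column)  # each add() returns it again
    chained.save()                     # persist the new schema
    return chained
```

For a fixed, small set of columns the same thing reads naturally as a single chained expression, for example dictionary.clear().add(first).add(second).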
errors
- exception tdw_catalog.errors.CatalogAbortedException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation was aborted, typically due to a concurrency issue like sequencer check failures, transaction aborts, etc.
- exception tdw_catalog.errors.CatalogAlreadyExistsException(*args, message, meta={})[source]
Bases:
CatalogExceptionAn attempt to create an entity failed because one already exists.
- exception tdw_catalog.errors.CatalogBadRouteException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe requested URL path wasn’t routable to a known method. This is returned by generated server code and should not be returned by application code (use “not_found” or “unimplemented” instead).
- exception tdw_catalog.errors.CatalogCanceledException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation was cancelled
- exception tdw_catalog.errors.CatalogDataLossException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation resulted in unrecoverable data loss or corruption.
- exception tdw_catalog.errors.CatalogDeadlineExceededException(*args, message, meta={})[source]
Bases:
CatalogExceptionOperation expired before completion. For operations that change the state of the system, this error may be returned even if the operation has completed successfully (timeout).
- exception tdw_catalog.errors.CatalogException(*args, code=Errors.Unknown, message='', meta={})[source]
Bases:
TwirpServerExceptionThe most generic Catalog platform error
- exception tdw_catalog.errors.CatalogFailedPreconditionException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation was rejected because the system is not in a state required for the operation’s execution. For example, doing an rmdir operation on a directory that is non-empty, or on a non-directory object, or when having conflicting read-modify-write on the same resource.
- exception tdw_catalog.errors.CatalogInternalException(*args, message, meta={})[source]
Bases:
CatalogExceptionWhen some invariants expected by the underlying system have been broken. In other words, something bad happened in the library or backend service. Twirp specific issues like wire and serialization problems are also reported as “internal” errors.
- exception tdw_catalog.errors.CatalogInvalidArgumentException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe client specified an invalid argument. This indicates arguments that are invalid regardless of the state of the system (i.e. a malformed file name, required argument, number out of range, etc.).
- exception tdw_catalog.errors.CatalogMalformedException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe client sent a message which could not be decoded. This may mean that the message was encoded improperly or that the client and server have incompatible message definitions.
- exception tdw_catalog.errors.CatalogNoErrorException(*args, message, meta={})[source]
Bases:
CatalogException
- exception tdw_catalog.errors.CatalogNotFoundException(*args, message, meta={})[source]
Bases:
CatalogExceptionSome requested entity was not found.
- exception tdw_catalog.errors.CatalogOutOfRangeException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation was attempted past the valid range. For example, seeking or reading past end of a paginated collection. Unlike “invalid_argument”, this error indicates a problem that may be fixed if the system state changes (i.e. adding more items to the collection).
- exception tdw_catalog.errors.CatalogPermissionDeniedException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe caller does not have permission to execute the specified operation. It must not be used if the caller cannot be identified (use “unauthenticated” instead).
- exception tdw_catalog.errors.CatalogResourceExhaustedException(*args, message, meta={})[source]
Bases:
CatalogExceptionSome resource has been exhausted or rate-limited, perhaps a per-user quota, or perhaps the entire file system is out of space.
- exception tdw_catalog.errors.CatalogUnauthenticatedException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe request does not have valid authentication credentials for the operation.
- exception tdw_catalog.errors.CatalogUnavailableException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe service is currently unavailable. This is most likely a transient condition and may be corrected by retrying with a backoff.
- exception tdw_catalog.errors.CatalogUnimplementedException(*args, message, meta={})[source]
Bases:
CatalogExceptionThe operation is not implemented or not supported/enabled in this service.
- exception tdw_catalog.errors.CatalogUnknownException(*args, message, meta={})[source]
Bases:
CatalogExceptionAn unknown error occurred. For example, this can be used when handling errors raised by APIs that do not return any error information.
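The description of the unavailable condition above suggests retrying with a backoff. A minimal, library-agnostic sketch of that idea follows. `call_with_backoff` and its parameters are hypothetical, and the exception type to retry on is passed in by the caller rather than imported from tdw_catalog.

```python
import time


def call_with_backoff(operation, retry_on, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on `retry_on` with exponential backoff.

    retry_on: exception class (or tuple of classes) treated as transient.
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            # Delay doubles on every retry: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
```

Only errors that are plausibly transient should be passed as `retry_on`; errors such as permission or invalid-argument failures will not be fixed by retrying.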
export
- class tdw_catalog.export.CSVExport[source]
Bases: _Export
CSVExport represents a signed download URL pointing to the CSV-formatted result of a Dataset export_csv() operation, alongside metadata concerning the exported data.
This class is deliberately formatted for use with pandas’ read_csv function, as follows: e1 = await dataset.export_csv() and df = pd.read_csv(e1.url, **e1)
Attributes
- query: str
The query statement which was used to create the
Export- created_at: datetime
The time this
Exportwas originally created- started_at: datetime
The time this
Exportwas started- finished_at: datetime
The time this
Exportwas completed- url: str
The CSV-formatted export results can be downloaded via this signed URL
- dtypeDict[str, Type]
Metadata describing the schema of the exported data
- parse_dates: List[str]
A list of columns within
dtypethat should be interpreted as dates- true_valuesList[str]
A list of values to interpret as “truthy”
- false_valuesList[str]
A list of values to interpret as “falsey”
- compressionOptional[str]
Indicates the compression format of the data, if any
glossary_term
- class tdw_catalog.glossary_term.GlossaryTerm(client, **kwargs)[source]
Bases:
EntityBase,_OrganizationRelationGlossaryTerms are used to categorize and classify columns withinDatasetsAttributes
- idstr
GlossaryTerm‘s unique id- organization_idstr
The unique ID of the
Organizationto which thisGlossaryTermbelongs- user_idstr
The unique ID of the
Userwho created thisGlossaryTerm- titlestr
The title for this
GlossaryTerm- description: Optional[str]
An optional description for this
GlossaryTerm- created_atdatetime
The datetime at which this
GlossaryTermwas created- updated_atdatetime
The datetime at which this
GlossaryTermwas last updated
- delete() None[source]
Delete this
GlossaryTerm. ThisGlossaryTermobject should not be used after delete() has successfully returnedParameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this
GlossaryTerm- CatalogNotFoundException
If the
GlossaryTermbeing deleted does not exist- CatalogException
If call to the
Catalogserver fails
- save() None[source]
Update this
GlossaryTerm, saving any changes to its titleParameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this
GlossaryTerm, or if the givenGlossaryTermID does not exist- CatalogException
If call to the
Catalogserver fails
list_datasets
- class tdw_catalog.list_datasets.DatasetAlias(alias_key: str, alias_values: List[str])[source]
Bases:
objectUsed to sort the results of
list_datasetsby specific aliases
- class tdw_catalog.list_datasets.Filter(limit: int = None, offset: int = None, keywords: List[str] | None = None, dataset_ids: List[str] | None = None, dataset_aliases: List[DatasetAlias] | None = None, reference_ids: List[str] | None = None, sources: List[Source] | None = None, topics: List[Topic] | None = None, creators: List[OrganizationMember] | None = None, states: List[ImportState] | None = None, warehouses: List[Warehouse] | None = None, timestamp_range: TimestampRange | None = None, sort: Sort | None = None)[source]
Bases: LegacyFilter
ListOrganizationDatasetsFilter filters the results from list_datasets on Organization.
Attributes
- keywords: Optional[List[str]]
Filters results according to the specified keyword(s) (fuzzy matching is supported)
- dataset_ids: Optional[List[str]]
Filters results to the list of given Dataset id(s)
- dataset_aliases: Optional[List[DatasetAlias]]
Filters results to the list of given Dataset alias(es)
- sources: Optional[List[Source]]
Filters results to the list of given Source(s)
- topics: Optional[List[Topic]]
Filters results to the list of given Topic(s)
- creators: Optional[List[OrganizationMember]]
Filters results to the list of given OrganizationMember(s), who created the returned Datasets
- states: Optional[List[ImportState]]
Filters results to the list of given ImportStates. Note that virtualized datasets will always be categorized as IMPORTED.
- warehouses: Optional[List[Warehouse]]
Filters results to the list of given Warehouses
- timestamp_range: Optional[TimestampRange]
Filters results to those within the given TimestampRange
- sort: Optional[Sort]
Sorts filtered results according to the provided Sort structure
- class tdw_catalog.list_datasets.Sort(field: SortableField, order: FilterSortOrder | None = FilterSortOrder.ASC)[source]
Bases:
objectUsed to sort the results of
list_datasetsonOrganization.
- enum tdw_catalog.list_datasets.SortableField(value)[source]
Bases:
StrEnumThe different fields which
list_datasetsonOrganizationcan be sorted by- Member Type:
str
Valid values are as follows:
- TITLE = <SortableField.TITLE: 'title'>
- CREATED_AT = <SortableField.CREATED_AT: 'created_at'>
- IMPORTED_AT = <SortableField.IMPORTED_AT: 'imported_at'>
- UPDATED_AT = <SortableField.UPDATED_AT: 'updated_at'>
- STATE = <SortableField.STATE: 'reference_state'>
- NEXT_INGEST = <SortableField.NEXT_INGEST: 'reference_next_ingest'>
- FAILED_AT = <SortableField.FAILED_AT: 'reference_failed_at'>
- SOURCE_NAME = <SortableField.SOURCE_NAME: 'source_label'>
- enum tdw_catalog.list_datasets.TimestampField(value)[source]
Bases:
IntEnumThe different possible fields that can be used to construct a
TimestampRangefilter forlist_datasetsonOrganization- Member Type:
int
Valid values are as follows:
- CREATED_AT = <TimestampField.CREATED_AT: 0>
- UPDATED_AT = <TimestampField.UPDATED_AT: 1>
- IMPORTED_AT = <TimestampField.IMPORTED_AT: 2>
- NEXT_INGEST = <TimestampField.NEXT_INGEST: 3>
- FAILED_AT = <TimestampField.FAILED_AT: 5>
- class tdw_catalog.list_datasets.TimestampRange(filter_by: TimestampField, start_time: datetime | None, end_time: datetime | None)[source]
Bases:
objectUsed to construct a temporal filter for
list_datasetsonOrganization, where a filter specifies aTimestampFieldand a time range
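The Filter signature above exposes limit and offset. How these page through results is not spelled out here, so the following is only a sketch of conventional limit/offset pagination. `fetch_page(limit, offset)` is a hypothetical callable standing in for a call such as list_datasets with a Filter built from those two values, and it is assumed to return a list that is shorter than `limit` on the final page.

```python
def iter_all(fetch_page, page_size=100):
    """Yield every item by walking limit/offset pages until one comes back short.

    fetch_page is a hypothetical callable taking limit= and offset=
    keyword arguments and returning a list of results for that page.
    """
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means nothing is left
        offset += page_size
```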
organization
- class tdw_catalog.organization.Organization(client, **kwargs)[source]
Bases: EntityBase
Organizations are the primary entrypoints to a DataCatalog, containing and linking together OrganizationMembers, Teams, Datasets, etc.
Attributes
- titlestr
The name of the
Organization- created_atdatetime
The datetime at which this
Organizationwas created- updated_atdatetime
The datetime at which this
Organizationwas last updated
- create_credential() CredentialFactory[source]
Provides a
CredentialFactorywhich is capable of creatingCredentials within thisOrganization.Parameters
Returns
- CredentialFactory
A factory for creating specific types of
Credentials
- create_dataset(source: Source, title: str, description: str | None = None) Dataset[source]
Creates a new
Datasetwithin thisOrganization. TheDatasetwill have a title and (optionally) a description, and must be associated with aSource. TheDatasetwill otherwise be empty and can be subsequently populated with metadata and data.Parameters
Returns
- Dataset
The newly created
Dataset
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Datasets in thisOrganization- CatalogInvalidArgumentException
If title is an empty string, or if the
Sourcebelongs to a differentOrganizationthan this one- CatalogException
If call to the
Catalogserver fails
- create_glossary_term(title: str, description: str | None = None) GlossaryTerm[source]
Create a
GlossaryTermwithin thisOrganizationParameters
- title: str
The name of the new
GlossaryTerm- description: Optional[str]
The description of the new
GlossaryTerm
Returns
- GlossaryTerm
The newly created
GlossaryTerm
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
GlossaryTerms in thisOrganization- CatalogInvalidArgumentException
If title is an empty string
- CatalogAlreadyExistsException
If a
GlossaryTermwith the provided title already exists in thisOrganization- CatalogException
If call to the
Catalogserver fails
- create_lineage(upstream_dataset: Dataset, downstream_dataset: Dataset, label: str, description: str | None = None, column_lineage: List[tuple[Union[str, List[str]], Union[str, List[str]]]] = []) DatasetLineageRelationship[source]
Create a
DatasetLineageRelationshipwithin thisOrganization. Each relationship describes a single source and destinationDataset(a single “edge” in the lineage graph), with optional column-level lineage.Branching (or many-to-many) relationships can be modelled by decomposing them into their individual edges.
Parameters
- upstream_datasetDataset
The source dataset involved in this
DatasetLineageRelationship- downstream_datasetDataset
The destination dataset involved in this
DatasetLineageRelationship- labelstr
A label describing this
DatasetLineageRelationship- descriptionOptional[str]
An optional description providing further details about this
DatasetLineageRelationship- column_lineageList[tuple[Union[str, List[str]],Union[str,List[str]]]]
An optional list of column-level associations between the two
Datasets, specified as tuples. Each tuple is a single column-level relationship between a list of upstream columns and a list of downstream columns. This argument defaults to the emptyListif not supplied. Example:[("address", ["street_number","street_name","city"])]
Returns
- DatasetLineageRelationship
The newly created
DatasetLineageRelationship
Raises
- CatalogInvalidArgumentException
If any specified column names within provided column lineage do not actually exist in the provided
Datasets- CatalogPermissionDeniedException
If the caller is not allowed to define lineage in this
Organization, or if they do not have access to one of the involvedDatasets- CatalogException
If call to the
Catalogserver fails
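The column_lineage argument accepts either a single column name or a list of names on each side of a tuple. A small sketch of normalizing such entries so that each side is always a list; `normalize_column_lineage` is a hypothetical helper for preparing input, not part of tdw_catalog, and whether the library needs this normalization is not stated here.

```python
from typing import List, Tuple, Union

ColumnRef = Union[str, List[str]]


def normalize_column_lineage(
    column_lineage: List[Tuple[ColumnRef, ColumnRef]],
) -> List[Tuple[List[str], List[str]]]:
    """Wrap bare column names in lists so every entry is (upstream, downstream) lists."""

    def as_list(ref: ColumnRef) -> List[str]:
        return [ref] if isinstance(ref, str) else list(ref)

    return [(as_list(up), as_list(down)) for up, down in column_lineage]
```

Applied to the example above, the single upstream column "address" becomes a one-element list paired with its three downstream columns.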
- create_metadata_template(title: str, description: str | None = None) MetadataTemplateCreationBuilder[source]
Provides a
MetadataTemplateCreationBuilderwhich is capable of creatingMetadataTemplates within thisOrganization.Parameters
- titlestr
The title for the
MetadataTemplate- descriptionOptional[str]
An optional description for the
MetadataTemplate
Returns
- MetadataTemplateCreationBuilder
A factory for creating new
MetadataTemplates
- create_or_replace_lineage(upstream_dataset: Dataset, downstream_dataset: Dataset, label: str, description: str | None = None, column_lineage: List[tuple[List[str], List[str]]] = []) DatasetLineageRelationship[source]
Create a DatasetLineageRelationship within this Organization. Each relationship describes a single source and destination Dataset (a single “edge” in the lineage graph), with optional column-level lineage.
Branching (or many-to-many) relationships can be modelled by decomposing them into their individual edges.
If no relationships between the given Datasets exist, one will be created. Unlike create_lineage, pre-existing relationships between the given Datasets will be cleared and replaced by this one, facilitating easy one-way syncs from an external lineage metadata source to the Catalog platform.
Parameters
- upstream_datasetDataset
The source dataset involved in this
DatasetLineageRelationship- downstream_datasetDataset
The destination dataset involved in this
DatasetLineageRelationship- labelstr
A label describing this
DatasetLineageRelationship- descriptionOptional[str]
An optional description providing further details about this
DatasetLineageRelationship- column_lineageList[tuple[Union[str, List[str]],Union[str,List[str]]]]
An optional list of column-level associations between the two
Datasets, specified as tuples. Each tuple is a single column-level relationship between a list of upstream columns and a list of downstream columns. This argument defaults to the emptyListif not supplied. Example:[("address", ["street_number","street_name","city"])]
Returns
- DatasetLineageRelationship
The newly created
DatasetLineageRelationship
Raises
- CatalogInvalidArgumentException
If any specified column names within provided column lineage do not actually exist in the provided
Datasets- CatalogPermissionDeniedException
If the caller is not allowed to define lineage in this
Organization, or if they do not have access to one of the involvedDatasets- CatalogException
If call to the
Catalogserver fails
- create_source(label: str, description: str | None = None) Source[source]
Create a
Sourcewithin thisOrganizationParameters
Returns
- Source:
The newly created
Source
Raises
- CatalogInternalException
If call to the
Catalogserver fails
- create_team(title: str) Team[source]
Create a
Teamwithin thisOrganizationParameters
- title: str
The name of the new
Team
Returns
- Team
The newly created
Team
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Teams in thisOrganization- CatalogException
If call to the
Catalogserver fails
- create_topic(title: str) Topic[source]
Create a
Topicwithin thisOrganizationParameters
- title: str
The name of the new
Topic
Returns
- Topic
The newly created
Topic
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create
Topics in thisOrganization- CatalogException
If call to the
Catalogserver fails
- delete() None[source]
Delete this
Organization. ThisOrganizationobject should not be used after delete() has successfully returned, as theCatalogorganization it represents will no longer exist.Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this
Organization- CatalogException
If call to the
Catalogserver fails
- get_connection(id: str) IngestionConnection | VirtualizationConnection[source]
Retrieve the given
IngestionConnectionorVirtualizationConnectionfrom thisOrganizationParameters
- idstr
The unique ID of the Connection
Returns
- Union[IngestionConnection,VirtualizationConnection]
The Connection with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve Connections from this
Organization- CatalogNotFoundException
If the given Connection ID does not exist
- CatalogException
If call to the
Catalogserver fails
- get_credential(credential_id: str) Credential[source]
Retrieve a
Credentialbelonging to thisOrganizationParameters
- credential_idstr
The unique ID of the
Credential
Returns
- Credential
The
Credentialassociated with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Credentials- CatalogNotFoundException
If the given
CredentialID does not exist- CatalogException
If call to the
Catalogserver fails
- get_dataset(id: str) Dataset | ConnectedDataset[source]
Retrieve the given
Datasetfrom thisOrganizationParameters
- idstr
The unique ID of the
Dataset
Returns
- Dataset
The
Datasetwith the given ID
Raises
- get_glossary_term(id: str) GlossaryTerm[source]
Retrieve the given
GlossaryTermfrom thisOrganizationParameters
- idstr
The unique ID of the
GlossaryTerm
Returns
- GlossaryTerm
The
GlossaryTermwith the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
GlossaryTerms from thisOrganization, or if the givenGlossaryTermID does not exist- CatalogException
If call to the
Catalogserver fails
- get_lineage(id: str) DatasetLineageRelationship[source]
Retrieve the given
DatasetLineageRelationshipfrom thisOrganizationParameters
- idstr
The unique ID of the
DatasetLineageRelationship
Returns
- DatasetLineageRelationship
The
DatasetLineageRelationshipwith the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
DatasetLineageRelationships from thisOrganization- CatalogInvalidArgumentException
If the given
DatasetLineageRelationshipID does not exist- CatalogException
If call to the
Catalogserver fails
- get_member(user_id: str) OrganizationMember[source]
Retrieve a specific member (User) of this Organization
Parameters
- user_idstr
The unique
UserID of theOrganizationMember
Returns
- OrganizationMember
The
OrganizationMemberwith the givenUserID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to fetch
OrganizationMembers- CatalogInvalidArgumentException
If the given
UserID does not exist or is not a member of thisOrganization- CatalogException
If call to the
Catalogserver fails
- get_metadata_template(id: str) MetadataTemplate[source]
Retrieve a
MetadataTemplatebelonging to thisOrganizationParameters
- idstr
The unique ID of the
MetadataTemplate
Returns
- MetadataTemplate
The
MetadataTemplateassociated with the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
MetadataTemplates- CatalogNotFoundException
If the given
MetadataTemplateID does not exist- CatalogException
If call to the
Catalogserver fails
- get_source(id: str) Source[source]
Retrieve a
Sourcebelonging to thisOrganizationParameters
- idstr
The unique ID of the
Source
Returns
- Source
The
Sourceassociated with the given ID
Raises
- get_team(team_id: str) Team[source]
Retrieve the given
Teamfrom thisOrganizationParameters
- team_idstr
The unique ID of the
Team
Returns
- Team
The
Teamwith the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Teams from thisOrganization- CatalogInvalidArgumentException
If the given
TeamID does not exist- CatalogException
If call to the
Catalogserver fails
- get_topic(id: str) Topic[source]
Retrieve the given
Topicfrom thisOrganizationParameters
- idstr
The unique ID of the
Topic
Returns
- Topic
The
Topicwith the given ID
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve
Topics from thisOrganization, or if the givenTopicID does not exist- CatalogException
If call to the
Catalogserver fails
- invite_member(user_id: str, roles: OrganizationMemberRoles = None) OrganizationMember[source]
Invite the given
Userto be anOrganizationMemberof thisOrganizationParameters
Returns
- OrganizationMember
The newly created
OrganizationMember
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite
OrganizationMembers- CatalogAlreadyExistsException
If the caller is inviting a
Userwho is already anOrganizationMemberof thisOrganization- CatalogInvalidArgumentException
If the given
UserID does not exist- CatalogException
If call to the
Catalogserver fails
- invite_members(emails: List[str], invite_message: str | None = '', raise_on_failure: bool | None = False, roles: OrganizationMemberRoles | None = None) InviteMembersResponse[source]
Invite the given
User(s) to becomeOrganizationMembers of thisOrganization. If a given email does not correspond to an existingUser, an invitation to theCatalogplatform will be sent via email.Parameters
- emails: List[str]
The list of email addresses of the invitees.
- invite_messageOptional[str]
The message to send the users when sending the invitation
- raise_on_failureOptional[bool]
Whether to raise an exception on a failure of any one invite
- rolesOptional[OrganizationMemberRoles]
The roles the new members will take when invited
Returns
- InviteMembersResponse
This contains a summary of the successful and failed invitations.
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite
OrganizationMembers- CatalogException
If call to the
Catalogserver fails
- list_connections(filter: ListConnectionsFilter | None = None) List[IngestionConnection | VirtualizationConnection][source]
List all VirtualizationConnections and IngestionConnections in this Organization
Parameters
- filterOptional[Filter]
An optional Filter on the returned Connection list, useful for pagination of results. Note that the organization_id property will be set automatically to this
Organization.
Returns
- List[Union[IngestionConnection,VirtualizationConnection]]
The list of Connections in this
Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Connections in this
Organization- CatalogException
If call to the
Catalogserver fails
- list_credentials(filter: LegacyFilter = None) List[Credential][source]
List
Credentials which belong to the givenOrganizationReturns
- List[Credential]
Credentials created under thisOrganization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list
Credentials- CatalogException
If call to the
Catalogserver fails
- list_datasets(filter: Filter | None = None) Iterator[Dataset][source]
Retrieve the list of Datasets which belong to the Organization. The number of results returned is limited; paginate via the filter to obtain additional results.
Parameters
- filter: Optional[list_datasets.Filter]
An optional filter on the returned Datasets (None by default)
Returns
- Iterator[Dataset]
An Iterator of Datasets belonging to this Organization, which are lazily fetched as the Iterator is iterated.
Raises
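Because the Iterator returned by list_datasets fetches lazily, a caller can stop early without pulling every page. The following is a minimal sketch of that consumption pattern only: `fake_dataset_iterator` is a hypothetical stand-in for the returned Iterator, and its page size of 3 is an assumption made for illustration, not a value taken from the library.

```python
from itertools import islice
from typing import Iterator, List

fetched_pages: List[int] = []  # records which pages the stand-in "fetched"


def fake_dataset_iterator(total: int, page_size: int = 3) -> Iterator[str]:
    """Hypothetical stand-in: yields items page by page, fetching lazily."""
    for start in range(0, total, page_size):
        fetched_pages.append(start // page_size)  # simulates one server call
        for i in range(start, min(start + page_size, total)):
            yield f"dataset-{i}"


# Take only the first four items; pages beyond the second are never requested.
first_four = list(islice(fake_dataset_iterator(total=10), 4))
print(first_four)
print(fetched_pages)
```

With the real client, the same `islice` call could presumably wrap the Iterator returned by `list_datasets` directly.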
- list_external_warehouses() List[ExternalWarehouse][source]
Retrieve the list of known ExternalWarehouses available to this Organization
Parameters
None
Returns
- List[ExternalWarehouse]
ExternalWarehouses that are available to this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list ExternalWarehouses (the caller must be an Organization admin, or have Dataset creation privileges)
- CatalogException
If the call to the Catalog server fails
- list_glossary_terms(organization_ids: List[str] | None = None, filter: ListGlossaryTermsFilter | None = None) List[GlossaryTerm][source]
List all GlossaryTerms in this Organization
Parameters
- organization_ids: Optional[List[str]]
An optional list of Organization IDs, used to list GlossaryTerms from multiple Organizations
- filter: Optional[ListGlossaryTermsFilter]
An optional ListGlossaryTermsFilter on the returned GlossaryTerm list, useful for pagination of results
Returns
- List[GlossaryTerm]
The list of GlossaryTerms in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list GlossaryTerms in this Organization
- CatalogException
If the call to the Catalog server fails
- list_members(filter: LegacyFilter | None = None) List[OrganizationMember][source]
Retrieve all OrganizationMembers of this Organization
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned OrganizationMember list, useful for pagination of results
Returns
- List[OrganizationMember]
The OrganizationMembers which are members of this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list OrganizationMembers
- CatalogException
If the call to the Catalog server fails
- list_metadata_templates() List[MetadataTemplate][source]
List all MetadataTemplates which belong to the given Organization
Returns
- List[MetadataTemplate]
MetadataTemplates created under this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list MetadataTemplates
- CatalogException
If the call to the Catalog server fails
- list_sources(filter: ListSourcesFilter | None = None) List[Source][source]
List Sources which belong to the given Organization
Parameters
- filter: Optional[ListSourcesFilter]
The ListSourcesFilter to be used when performing the search
Returns
- List[Source]
Sources created under this Organization
Raises
- CatalogException
If the call to the Catalog server fails
- list_target_warehouses() List[TargetWarehouse][source]
Retrieve the list of known TargetWarehouses available to this Organization
Parameters
None
Returns
- List[TargetWarehouse]
TargetWarehouses that are available to this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list TargetWarehouses (the caller must be an Organization admin, or have Dataset creation privileges)
- CatalogException
If the call to the Catalog server fails
- list_teams(organization_ids=None, filter: LegacyFilter | None = None) List[Team][source]
List all Teams in this Organization
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Team list, useful for pagination of results
Returns
- List[Team]
The list of Teams in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Teams in this Organization
- CatalogException
If the call to the Catalog server fails
- list_topics(filter: LegacyFilter = None) List[Topic][source]
List all Topics in this Organization
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Topic list, useful for pagination of results
Returns
- List[Topic]
The list of Topics in this Organization
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Topics in this Organization
- CatalogException
If the call to the Catalog server fails
- save() None[source]
Update this Organization, saving any changes to its title
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this Organization
- CatalogException
If the call to the Catalog server fails
organization_member
- class tdw_catalog.organization_member.OrganizationMember(client, **kwargs)[source]
Bases: User, _OrganizationRelation
An OrganizationMember reflects a relationship between User and Organization, where the User has been invited to the Organization and been granted specific privileges within the Organization.
Attributes
- user_id: str
The unique user ID of the OrganizationMember
- organization: organization.Organization
The Organization object that relates to the organization_id of this model
- organization_id: str
The unique ID of the Organization to which this OrganizationMember belongs
- roles: OrganizationMemberRoles
The roles this Member has within their Organization
- created_at: datetime
The datetime at which this OrganizationMember was added to the Organization
- updated_at: datetime
The datetime at which this OrganizationMember was last updated
- delete() None[source]
Remove this OrganizationMember from the Organization. This OrganizationMember object should not be used after delete() returns successfully.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this OrganizationMember, or if the caller is attempting to delete themselves
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve an OrganizationMember belonging to this Organization
Parameters
- client: Catalog
The Catalog client of the Organization containing the OrganizationMember
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the OrganizationMember
Returns
- OrganizationMember
The OrganizationMember associated with the given ID
Raises
- CatalogInternalException
If the call to the Catalog server fails
- CatalogNotFoundException
If no OrganizationMember is found matching the provided ID
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve this OrganizationMember
- get_teams(filter: LegacyFilter | None = None) List[Team][source]
Retrieve the Teams to which this OrganizationMember belongs
Parameters
- filter: Optional[LegacyFilter]
An optional filter on the returned Team list, useful for pagination of results
Returns
- List[Team]
The list of Teams to which this OrganizationMember belongs
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve Teams from this Organization
- CatalogException
If the call to the Catalog server fails
- save() None[source]
Update this OrganizationMember, saving any changes to its roles
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this OrganizationMember
- CatalogAlreadyExistsException
If the caller is attempting to invite a member which is already a member of the Organization
- CatalogException
If the call to the Catalog server fails
- class tdw_catalog.organization_member.OrganizationMemberRoles(role_data_uploader: bool = False, role_data_viewer: bool = False, role_data_editor: bool = False, role_member_manager: bool = False, role_organization_manager: bool = False, role_admin: bool = False, role_topic_manager: bool = False, role_field_template_manager: bool = False)[source]
Bases: object
OrganizationMemberRoles defines the roles which an OrganizationMember has within an Organization
Attributes
- role_data_uploader: bool
Whether this OrganizationMember is allowed to upload to the Organization
- role_data_viewer: bool
Whether this OrganizationMember is allowed to view Datasets within the Organization
- role_data_editor: bool
Whether this OrganizationMember is allowed to modify Datasets within the Organization
- role_member_manager: bool
Whether this OrganizationMember is allowed to manage members within the Organization
- role_organization_manager: bool
Whether this OrganizationMember is allowed to manage the Organization
- role_admin: bool
Whether this OrganizationMember is an Organization administrator
- role_topic_manager: bool
Whether this OrganizationMember is allowed to manage Topics within the Organization
- role_field_template_manager: bool
Whether this OrganizationMember is allowed to manage MetadataTemplates
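The role flags above are independent booleans that all default to False, so a caller enables only the ones it needs. The sketch below illustrates this with `RolesSketch`, a hypothetical dataclass that copies the documented constructor parameters; it is not the library class, and `granted_roles` is a helper invented for this example.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class RolesSketch:
    """Hypothetical stand-in mirroring the documented constructor defaults."""
    role_data_uploader: bool = False
    role_data_viewer: bool = False
    role_data_editor: bool = False
    role_member_manager: bool = False
    role_organization_manager: bool = False
    role_admin: bool = False
    role_topic_manager: bool = False
    role_field_template_manager: bool = False


def granted_roles(roles: RolesSketch) -> List[str]:
    """Return the names of the role flags that are set to True."""
    return [f.name for f in fields(roles) if getattr(roles, f.name)]


viewer_and_editor = RolesSketch(role_data_viewer=True, role_data_editor=True)
print(granted_roles(viewer_and_editor))
```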
organization_utils
- class tdw_catalog.organization_utils.InviteMembersResponse(failed_invitations: List[InviteMembersResponseFailedInvitation], successful_invitations: List[OrganizationMember])[source]
Bases: object
InviteMembersResponse contains the successfully invited members and summarizes any failed invitations.
Attributes
- failed_invitations: List[InviteMembersResponseFailedInvitation]
List of email addresses and error message summaries of the failed invitations.
- successful_invitations: List[organization_member.OrganizationMember]
List of members which were successfully invited to the Organization.
- class tdw_catalog.organization_utils.InviteMembersResponseFailedInvitation(email: str, error_message: str)[source]
Bases: object
InviteMembersResponseFailedInvitation is a container for a single failed invitation, providing information about why that invitation failed to send.
Attributes
- email: str
The email address of the invitee
- error_message: str
A message indicating why the invitation failed to send.
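A caller that leaves raise_on_failure unset will typically want to inspect the failed invitations afterwards. The sketch below shows one way to do that; `FailedInvitationSketch` and `InviteResponseSketch` are hypothetical stand-ins carrying only the attributes documented above, and the sample email addresses and error text are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FailedInvitationSketch:
    """Hypothetical stand-in for a single failed invitation."""
    email: str
    error_message: str


@dataclass
class InviteResponseSketch:
    """Hypothetical stand-in for the invite response."""
    failed_invitations: List[FailedInvitationSketch]
    successful_invitations: List[str]  # the real list holds OrganizationMembers


def failures_by_email(response: InviteResponseSketch) -> Dict[str, str]:
    """Map each failed invitee's email address to its error message."""
    return {f.email: f.error_message for f in response.failed_invitations}


response = InviteResponseSketch(
    failed_invitations=[FailedInvitationSketch("a@example.com", "invalid address")],
    successful_invitations=["b@example.com"],
)
summary = failures_by_email(response)
print(summary)
```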
query
- class tdw_catalog.query.QueryCursor(res: Dict[str, any])[source]
Bases: object
QueryCursor is a Python DB API-style Cursor object (PEP 249) for query results from the Catalog.
Attributes
- arraysize: number
Read/write attribute that controls the number of rows returned by fetchmany(). The default value is 1, which means a single row is fetched per call.
- description: List[tuple]
Read-only attribute that provides the column names of the last query. To remain compatible with the Python DB API, it returns a 7-tuple for each column where the last five items of each tuple are None.
- fetchall() List[tuple][source]
Return all (remaining) rows of a query result as a list. Return an empty list if no rows are available.
- fetchmany(size=None) List[tuple][source]
Return the next set of rows of a query result as a list. Return an empty list if no more rows are available.
The number of rows to fetch per call is specified by the size parameter. If size is not given, arraysize determines the number of rows to be fetched. If fewer than size rows are available, as many rows as are available are returned.
Note there are performance considerations involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, then it is best for it to retain the same value from one fetchmany() call to the next.
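The arraysize and fetchmany() behaviour described above follows PEP 249, so it can be demonstrated with any DB API cursor. The sketch below uses the standard library's sqlite3 cursor as a stand-in, since constructing a QueryCursor requires a live Catalog query; the table and its contents are invented for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for a Catalog query result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

cursor = conn.execute("SELECT n FROM t ORDER BY n")
cursor.arraysize = 2  # fetchmany() without a size now returns up to 2 rows

batches = []
while True:
    rows = cursor.fetchmany()
    if not rows:  # an empty list signals that no rows remain
        break
    batches.append(rows)

print(batches)
conn.close()
```

Setting arraysize once, rather than passing a varying size to each call, keeps the batch size stable across calls, which is the usage the note above recommends.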
source
- class tdw_catalog.source.Source(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
A Source is used to semantically group a set of related Datasets. Users are free to label a Source in a descriptive way to best understand the meaning behind this grouping.
Attributes
- id: str
Source's unique id
- organization: Organization
The Organization associated with this Source. An Organization or organization_id can be provided, but not both.
- organization_id: str
The unique ID of the Organization to which this Source belongs
- user_id: str
The unique user ID of the OrganizationMember who created this Source
- label: str
A descriptive label for this Source
- description: Optional[str] = None
An optional extended description for this Source
- created_at: datetime
The datetime at which this Source was created
- updated_at: datetime
The datetime at which this Source was last updated
- create_ingestion_connection(label: str, portal: ConnectionPortalType, url: str | None = None, description: str | None = None, warehouse: Warehouse | None = None, credential: Credential | None = None, ingest_schedules: List[ConnectionSchedule] | None = None) IngestionConnection[source]
Create an IngestionConnection within this Source
Parameters
- label: str
The descriptive label for this IngestionConnection
- portal: ConnectionPortalType
The method of data access employed by this IngestionConnection
- url: Optional[str]
A canonical URL that points to the location of data resources within the portal
- description: Optional[str] = None
An optional extended description for this IngestionConnection
- warehouse: Optional[Warehouse]
Datasets created using this IngestionConnection will ingest to this Warehouse by default (can be overridden at ingest time).
- credential: Optional[Credential]
The Credential associated with this IngestionConnection.
- ingest_schedules: Optional[List[ConnectionSchedule]]
Optional ConnectionSchedules which, when specified, indicate the frequency with which to reingest ingested data. Specific Datasets using this IngestionConnection may override this set of Schedules.
Returns
- IngestionConnection
The newly created IngestionConnection
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to create IngestionConnections in this Organization
- CatalogException
If the call to the Catalog server fails
- delete() None[source]
Delete this Source. This Source object should not be used after delete() has successfully returned
Raises
- classmethod get(client, organization_id: str, id: str)[source]
Retrieve a Source belonging to this Organization
Parameters
Returns
- Source
The Source associated with the given ID
Raises
- list_connections(filter: ListConnectionsFilter | None = None) List[IngestionConnection | VirtualizationConnection][source]
List all IngestionConnection and VirtualizationConnections belonging to this Source
Parameters
- filter: Optional[ListConnectionsFilter]
An optional filter on the returned Connection list, useful for pagination of results. Note that the organization_id and source_ids properties will be set automatically to this Organization and Source.
Returns
- List[Connection]
The list of Connections in this Source
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list Connections in this Organization
- CatalogException
If the call to the Catalog server fails
team
- class tdw_catalog.team.Team(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
Teams are sets of OrganizationMembers, with which Datasets can be shared.
Attributes
- id: str
The unique ID of this Team
- organization: organization.Organization
The Organization that relates to the organization_id on the model
- organization_id: str
The unique ID of the Organization to which this Team belongs
- title: str
The name of this Team
- created_at: datetime
The datetime at which this Team was created
- updated_at: datetime
The datetime at which this Team was last updated
- add_member(user_id: str, permission: TeamMemberPermissionLevel) TeamMember[source]
Add a User to the Team as a TeamMember. The User in question must already be a member of the containing Organization.
Parameters
- user_id: str
The unique User ID of the invitee
Returns
- TeamMember
The newly created TeamMember
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to invite Team members
- CatalogAlreadyExistsException
If the caller is inviting a User who is already a TeamMember of this Team
- CatalogInvalidArgumentException
If the given User ID does not exist
- CatalogException
If the call to the Catalog server fails
- delete() None[source]
Delete this Team. This Team object should not be used after delete() has successfully returned
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this Team
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, organization_id: str, id: str)[source]
Retrieve a Team belonging to this Organization
Parameters
- client: Catalog
A Catalog client
- organization_id: str
The unique ID of the Organization
- id: str
The unique ID of the Team
Returns
- Team
The Team associated with the given ID
Raises
- get_member(user_id: str) TeamMember[source]
Retrieve a specific member (User) of this Team
Parameters
- user_id: str
The unique User ID of the TeamMember
Returns
- TeamMember
The TeamMember with the given User ID
Raises
- list_members(filter: LegacyFilter | None = None) List[TeamMember][source]
Retrieve all TeamMembers of this Team
Parameters
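- filter: Optional[LegacyFilter]
An optional filter on the returned TeamMember list, useful for pagination of results
Returns
- List[TeamMember]
The TeamMembers which are members of this Team
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to list TeamMembers
- CatalogException
If the call to the Catalog server fails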
team_member
- class tdw_catalog.team_member.TeamMember(client, **kwargs)[source]
Bases: User
A TeamMember reflects a relationship between User and Team, where the User has been invited to the Team and been granted specific privileges within the Team.
Attributes
- team: team.Team
The Team that relates to the team_id of the model
- team_id: str
The unique ID of the Team to which this TeamMember belongs
- permission: TeamMemberPermissionLevel
- created_at: datetime
The timestamp at which this TeamMember was added to the Team
- updated_at: datetime
The timestamp at which this TeamMember's permission was last changed
- delete() None[source]
Remove this TeamMember from the Team. This TeamMember object should not be used after delete() returns successfully.
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to delete this TeamMember, or if the caller is attempting to delete themselves
- CatalogException
If the call to the Catalog server fails
- classmethod get(client: Catalog, team_id: str, id: str)[source]
Retrieve a TeamMember
Parameters
- client: Catalog
The Catalog client
- team_id: str
The unique ID of the Team
- id: str
The unique ID of the TeamMember
Returns
- TeamMember
The TeamMember associated with the given ID
Raises
- CatalogInternalException
If the call to the Catalog server fails
- CatalogNotFoundException
If no TeamMember is found matching the provided ID
- CatalogPermissionDeniedException
If the caller is not allowed to retrieve this TeamMember
- save() None[source]
Update this TeamMember, saving any changes to its permission level
Parameters
None
Returns
None
Raises
- CatalogPermissionDeniedException
If the caller is not allowed to update this TeamMember
- CatalogInvalidArgumentException
If the caller supplies an invalid permission level before saving this TeamMember
- CatalogException
If the call to the Catalog server fails
topic
- class tdw_catalog.topic.Topic(client, **kwargs)[source]
Bases: EntityBase, _OrganizationRelation
Topics are used to classify Datasets within an Organization. Classification can be used as a means to apply a grouping label to one or more Datasets.
Attributes
- id: str
Topic's unique id
- organization_id: str
The unique ID of the Organization to which this Topic belongs
- created_by: str
The unique user ID of the user who created this Topic
- title: str
The title for this Topic
- created_at: datetime
The datetime at which this Topic was created
- updated_at: datetime
The datetime at which this Topic was last updated
- delete() None[source]
Delete this Topic. This Topic object should not be used after delete() has successfully returned
Parameters
None
Returns
None
Raises
user
- class tdw_catalog.user.User(client, **kwargs)[source]
Bases: EntityBase
A User is a registered user of the ThinkData Catalog. Currently Users can only be created through the Catalog user interface, and cannot be created through the API. Users can be added as members of Organizations and Teams, and have Datasets shared with them.
Attributes
warehouse
- class tdw_catalog.warehouse.ExternalWarehouse(client, **kwargs)[source]
Bases: Warehouse
An ExternalWarehouse is a Warehouse which is configured for Data Virtualization. New data cannot be written to an ExternalWarehouse, but virtualized Datasets can be created which read from it.
Attributes
- database_name: Optional[str]
If set, the database to virtualize tables and views from
- schema: Optional[str]
If set, the schema to virtualize tables and views from
- class tdw_catalog.warehouse.TargetWarehouse(client, **kwargs)[source]
Bases: Warehouse
A TargetWarehouse is a Warehouse which is configured for data ingestion. New data can be written to a TargetWarehouse.
- class tdw_catalog.warehouse.Warehouse(client, **kwargs)[source]
Bases: EntityBase
A Warehouse is a place where Datasets are stored. Currently, Warehouses are configured at the deployment-level and cannot be modified through this SDK.
Attributes
- name: str
The unique name of the warehouse in the system. This name will never change for the life of the Warehouse.
- display_name: str
The descriptive name of the Warehouse.
- warehouse_type: str
The type of Warehouse this represents.
- external: Optional[bool]
True if this Warehouse is virtualized within the Catalog
utils
- enum tdw_catalog.utils.ColumnType(value)[source]
Bases: StrEnum
The different possible data types for Columns within a DataDictionary
- Member Type: str
Valid values are as follows:
- BOOLEAN = <ColumnType.BOOLEAN: 'boolean'>
- DATE = <ColumnType.DATE: 'date'>
- DATETIME = <ColumnType.DATETIME: 'datetime'>
- INTEGER = <ColumnType.INTEGER: 'integer'>
- DECIMAL = <ColumnType.DECIMAL: 'decimal'>
- PERCENT = <ColumnType.PERCENT: 'percent'>
- CURRENCY = <ColumnType.CURRENCY: 'currency'>
- STRING = <ColumnType.STRING: 'string'>
- TEXT = <ColumnType.TEXT: 'text'>
- GEOMETRY = <ColumnType.GEOMETRY: 'geometry'>
- GEOJSON = <ColumnType.GEOJSON: 'geojson'>
- enum tdw_catalog.utils.ConnectionPortalType(value)[source]
Bases: StrEnum
- Member Type: str
Valid values are as follows:
- GS = <ConnectionPortalType.GS: 'Gs'>
- S3 = <ConnectionPortalType.S3: 'S3'>
- UNITY = <ConnectionPortalType.UNITY: 'Unity'>
- FTP = <ConnectionPortalType.FTP: 'Ftp'>
- SFTP = <ConnectionPortalType.SFTP: 'Sftp'>
- EXTERNAL = <ConnectionPortalType.EXTERNAL: 'External'>
- NULL = <ConnectionPortalType.NULL: 'Null'>
- IMPORT_LITE = <ConnectionPortalType.IMPORT_LITE: 'ImportLite'>
- HTTP = <ConnectionPortalType.HTTP: 'Http'>
- CATALOG = <ConnectionPortalType.CATALOG: 'Namara'>
- class tdw_catalog.utils.CurrencyFieldValue(value: float, currency: str)[source]
Bases: object
CurrencyFieldValue models the value of a currency field
Attributes
- value: float
The currency value
- currency: str
The specific currency to which the value belongs
- class tdw_catalog.utils.Filter(limit: int = None, offset: int = None)[source]
Bases: LegacyFilter
Filter describes the ways in which results should be filtered and/or paginated. It is serialized differently from LegacyFilter.
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
- class tdw_catalog.utils.FilterSort(field: str, order: FilterSortOrder = FilterSortOrder.ASC)[source]
Bases: object
FilterSort describes a desired sort field and order for results.
Attributes
- field: str
The field to sort by
- order: FilterSortOrder, optional
The order to sort in (FilterSortOrder.ASC by default)
- enum tdw_catalog.utils.FilterSortOrder(value)[source]
Bases: Enum
Valid values are as follows:
- ASC = <FilterSortOrder.ASC: 1>
- DESC = <FilterSortOrder.DESC: 2>
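To make the meaning of a sort field and order concrete, the sketch below applies an equivalent sort to an in-memory list. `SortOrderSketch` and `FilterSortSketch` are hypothetical stand-ins that copy the documented fields and enum values, and `apply_sort` is invented for this example; the real sorting is presumably performed by the Catalog server, which is an assumption rather than something this reference states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class SortOrderSketch(Enum):
    """Hypothetical stand-in with the documented member values."""
    ASC = 1
    DESC = 2


@dataclass
class FilterSortSketch:
    """Hypothetical stand-in: a sort field plus an order, ascending by default."""
    field: str
    order: SortOrderSketch = SortOrderSketch.ASC


def apply_sort(rows: List[Dict[str, str]], sort: FilterSortSketch) -> List[Dict[str, str]]:
    """Return rows ordered by sort.field, reversed when the order is DESC."""
    descending = sort.order is SortOrderSketch.DESC
    return sorted(rows, key=lambda row: row[sort.field], reverse=descending)


rows = [{"title": "b"}, {"title": "a"}, {"title": "c"}]
ascending = apply_sort(rows, FilterSortSketch("title"))
descending = apply_sort(rows, FilterSortSketch("title", SortOrderSketch.DESC))
print([r["title"] for r in ascending], [r["title"] for r in descending])
```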
- enum tdw_catalog.utils.ImportState(value)[source]
Bases: StrEnum
The different possible states an imported dataset might occupy. Virtualized datasets will always show state IMPORTED.
- Member Type: str
Valid values are as follows:
- IMPORTED = <ImportState.IMPORTED: 'imported'>
- IMPORTING = <ImportState.IMPORTING: 'importing'>
- QUEUED = <ImportState.QUEUED: 'queued'>
- FAILED = <ImportState.FAILED: 'failed'>
- class tdw_catalog.utils.LegacyFilter(limit: int = None, offset: int = None)[source]
Bases: object
LegacyFilter describes the ways in which results should be filtered and/or paginated
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
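The limit and offset attributes combine into the usual paging loop: request a page, advance the offset by the page size, and stop when a page comes back short. The sketch below shows that loop against an in-memory list. `fetch_all` and `fake_fetch_page` are hypothetical helpers, and the stopping rule assumes the server returns fewer than limit results only on the final page.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int, int], List[str]], page_size: int) -> List[str]:
    """Collect every result by advancing offset until a short page arrives."""
    results: List[str] = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        results.extend(page)
        if len(page) < page_size:  # short or empty page: nothing left to fetch
            return results
        offset += page_size


DATA = [f"team-{i}" for i in range(7)]


def fake_fetch_page(limit: int, offset: int) -> List[str]:
    """Hypothetical stand-in for a list call that honours limit and offset."""
    return DATA[offset:offset + limit]


all_teams = fetch_all(fake_fetch_page, page_size=3)
print(all_teams)
```

Going by the signatures in this reference, `fetch_page` could wrap a call such as `organization.list_teams(filter=LegacyFilter(limit=limit, offset=offset))` when used with the real client.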
- class tdw_catalog.utils.ListConnectionsFilter(limit: int = None, offset: int = None, organization_id: str | None = None, source_ids: List[str] | None = None, portals: List[ConnectionPortalType] | None = None)[source]
Bases: LegacyFilter
ListConnectionsFilter filters results according to Connection fields
Attributes
- organization_id: Optional[str]
Filters results by organization_id
- source_ids: Optional[List[str]]
Filters results to the given source_id(s)
- portals: Optional[List[ConnectionPortalType]]
Filters results to the given ConnectionPortalType(s)
- class tdw_catalog.utils.ListGlossaryTermsFilter(limit: int = None, offset: int = None, glossary_term_ids: List[str] | None = None)[source]
Bases: Filter
ListGlossaryTermsFilter filters results according to GlossaryTerm ids
Attributes
- glossary_term_ids: Optional[List[str]]
Filters results to the given glossary_term_id(s)
- class tdw_catalog.utils.ListOrganizationsFilter(limit: int = None, offset: int = None, organization_ids: List[str] | None = None)[source]
Bases: LegacyFilter
ListOrganizationsFilter filters Organization results according to a set of provided ids
Attributes
- organization_ids: List[str], optional
Filters results according to a set of provided ids
- class tdw_catalog.utils.ListSourcesFilter(limit: int = None, offset: int = None, labels: str | None = None)[source]
Bases: LegacyFilter
ListSourcesFilter filters results according to Source fields
Attributes
- labels: Optional[str]
Filters results by label. This will match label substrings.
- enum tdw_catalog.utils.MetadataFieldType(value)[source]
Bases: IntEnum
The different possible data types for values stored in MetadataFields and default values stored in MetadataTemplateFields
- Member Type: int
Valid values are as follows:
- FT_STRING = <MetadataFieldType.FT_STRING: 0>
- FT_INTEGER = <MetadataFieldType.FT_INTEGER: 1>
- FT_DECIMAL = <MetadataFieldType.FT_DECIMAL: 2>
- FT_DATE = <MetadataFieldType.FT_DATE: 3>
- FT_DATETIME = <MetadataFieldType.FT_DATETIME: 4>
- FT_DATASET = <MetadataFieldType.FT_DATASET: 5>
- FT_URL = <MetadataFieldType.FT_URL: 6>
- FT_USER = <MetadataFieldType.FT_USER: 7>
- FT_ATTACHMENT = <MetadataFieldType.FT_ATTACHMENT: 8>
- FT_LIST = <MetadataFieldType.FT_LIST: 9>
- FT_CURRENCY = <MetadataFieldType.FT_CURRENCY: 10>
- FT_TEAM = <MetadataFieldType.FT_TEAM: 11>
- FT_ALIAS = <MetadataFieldType.FT_ALIAS: 12>
- class tdw_catalog.utils.QueryFilter(limit: int = None, offset: int = None, sort: FilterSort = None, query: str | None = None)[source]
Bases: SortableFilter
QueryFilter filters results according to a NiQL query
Attributes
- query: str, optional
Filters results according to a NiQL query
- class tdw_catalog.utils.SortableFilter(limit: int = None, offset: int = None, sort: FilterSort = None)[source]
Bases: LegacyFilter
SortableFilter describes the ways in which results should be filtered, paginated and/or sorted.
Attributes
- limit: int, optional
Limits the number of results. Useful for pagination. (None by default)
- offset: int, optional
Offsets the result list by the given number of results. Useful for pagination. (None by default)
- sort: FilterSort, optional
Specifies a desired sort field and order for results (None by default).