aws_glue (aws v1.0.4)

Glue

Defines the public endpoint for the Glue service.
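Every operation in this module takes an aws-erlang client map as its first argument and a request map whose binary keys mirror the Glue API reference. A minimal shell sketch, with placeholder credentials, region, and database name (the three-element success tuple is the aws-erlang convention):

    % Build a client map from static credentials; see aws_client for other options.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"eu-west-1">>),
    % Request keys are binaries that follow the Glue API shapes.
    {ok, Result, _HttpResponse} =
        aws_glue:get_tables(Client, #{<<"DatabaseName">> => <<"my_database">>}),
    Tables = maps:get(<<"TableList">>, Result, []).

Later examples on this page assume a Client map built this way.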

Summary

Functions

Creates one or more partitions in a batch operation.
Deletes a list of connection definitions from the Data Catalog.
Deletes one or more partitions in a batch operation.

Deletes multiple tables at once.

Deletes a specified batch of versions of a table.
Retrieves information about a list of blueprints.

Returns a list of resource metadata for a given list of crawler names.

Retrieves the details for the custom patterns specified by a list of names.
Retrieves a list of data quality results for the specified result IDs.

Returns a list of resource metadata for a given list of development endpoint names.

Returns a list of resource metadata for a given list of job names.

Retrieves partitions in a batch request.
Returns the configuration for the specified table optimizers.

Returns a list of resource metadata for a given list of trigger names.

Returns a list of resource metadata for a given list of workflow names.

Stops one or more job runs for a specified job definition.
Updates one or more partitions in a batch operation.
Cancels the specified recommendation run that was being used to generate rules.
Cancels a run where a ruleset is being evaluated against a data source.

Cancels (stops) a task run.

Cancels the statement.

Validates the supplied schema.

Registers a blueprint with Glue.

Creates a classifier in the user's account.

Creates a connection definition in the Data Catalog.

Creates a new crawler with specified targets, role, configuration, and optional schedule.

Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data.

Creates a data quality ruleset with DQDL rules applied to a specified Glue table.

Creates a new database in a Data Catalog.
Creates a new development endpoint.
Creates a new job definition.

Creates a Glue machine learning transform.

Creates a new partition.
Creates a specified partition index in an existing table.
Creates a new registry which may be used to hold a collection of schemas.

Creates a new schema set and registers the schema definition.

Transforms a directed acyclic graph (DAG) into code.

Creates a new security configuration.

Creates a new session.
Creates a new table definition in the Data Catalog.

Creates a new table optimizer for a specific function.

Creates a new trigger.
Creates a new function definition in the Data Catalog.
Creates a new workflow.
Deletes an existing blueprint.
Removes a classifier from the Data Catalog.

Delete the partition column statistics of a column.

Deletes table statistics of columns.

Deletes a connection from the Data Catalog.
Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.
Deletes a custom pattern by specifying its name.
Deletes a data quality ruleset.

Removes a specified database from a Data Catalog.

Deletes a specified development endpoint.

Deletes a specified job definition.

Deletes a Glue machine learning transform.

Deletes a specified partition.
Deletes a specified partition index from an existing table.

Delete the entire registry including schema and all of its versions.

Deletes a specified policy.

Deletes the entire schema set, including the schema set and all of its versions.

Remove versions from the specified schema.

Deletes a specified security configuration.
Deletes the session.

Removes a table definition from the Data Catalog.

Deletes an optimizer and all associated metadata for a table.

Deletes a specified version of a table.

Deletes a specified trigger.

Deletes an existing function definition from the Data Catalog.
Deletes a workflow.
Retrieves the details of a blueprint.
Retrieves the details of a blueprint run.
Retrieves the details of blueprint runs for a specified blueprint.
Retrieves the status of a migration operation.
Retrieve a classifier by name.
Lists all classifier objects in the Data Catalog.

Retrieves partition statistics of columns.

Retrieves table statistics of columns.

Get the associated metadata/information for a task run, given a task run ID.
Retrieves information about all runs associated with the specified table.
Retrieves a connection definition from the Data Catalog.
Retrieves a list of connection definitions from the Data Catalog.
Retrieves metadata for a specified crawler.
Retrieves metrics about specified crawlers.
Retrieves metadata for all crawlers defined in the customer account.
Retrieves the details of a custom pattern by specifying its name.
Retrieves the security configuration for a specified catalog.
Retrieves the result of a data quality rule evaluation.
Gets the specified recommendation run that was used to generate rules.
Returns an existing ruleset by identifier or name.
Retrieves a specific run where a ruleset is evaluated against a data source.
Retrieves the definition of a specified database.
Retrieves all databases defined in a given Data Catalog.
Transforms a Python script into a directed acyclic graph (DAG).

Retrieves information about a specified development endpoint.

Retrieves all the development endpoints in this Amazon Web Services account.

Retrieves an existing job definition.

Returns information on a job bookmark entry.

Retrieves the metadata for a given job run.
Retrieves metadata for all runs of a given job definition.
Retrieves all current job definitions.
Creates mappings.

Gets details for a specific task run on a machine learning transform.

Gets a list of runs for a machine learning transform.

Gets a Glue machine learning transform artifact and all its corresponding metadata.

Gets a sortable, filterable list of existing Glue machine learning transforms.

Retrieves information about a specified partition.
Retrieves the partition indexes associated with a table.
Retrieves information about the partitions in a table.
Gets code to perform a specified mapping.
Describes the specified registry in detail.

Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants.

Retrieves a specified resource policy.
Describes the specified schema in detail.

Retrieves a schema by the SchemaDefinition.

Get the specified schema by its unique ID assigned when a version of the schema is created or registered.

Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.

Retrieves a specified security configuration.
Retrieves a list of all security configurations.
Retrieves the session.
Retrieves the statement.
Retrieves the Table definition in a Data Catalog for a specified table.
Returns the configuration of all optimizers associated with a specified table.
Retrieves a specified version of a table.
Retrieves a list of strings that identify available versions of a specified table.
Retrieves the definitions of some or all of the tables in a given Database.
Retrieves a list of tags associated with a resource.
Retrieves the definition of a trigger.
Gets all the triggers associated with a job.

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

Retrieves table metadata from the Data Catalog that contains unfiltered metadata.

Retrieves a specified function definition from the Data Catalog.
Retrieves multiple function definitions from the Data Catalog.
Retrieves resource metadata for a workflow.
Retrieves the metadata for a given workflow run.
Retrieves the workflow run properties which were set during the run.
Retrieves metadata for all runs of a given workflow.
Imports an existing Amazon Athena Data Catalog to Glue.
Lists all the blueprint names in an account.
List all task runs for a particular account.

Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag.

Returns all the crawls of a specified crawler.

Lists all the custom patterns that have been created.
Returns all data quality execution results for your account.
Lists the recommendation runs meeting the filter criteria.
Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source.
Returns a paginated list of rulesets for the specified list of Glue tables.

Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag.

Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag.

Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag.

Returns a list of registries that you have created, with minimal registry information.

Returns a list of schema versions that you have created, with minimal information.

Returns a list of schemas with minimal details.

Retrieve a list of sessions.
Lists statements for the session.
Lists the history of previous optimizer runs for a specific table.

Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag.

Lists names of workflows created in the account.

Sets the security configuration for a specified catalog.

Sets the Data Catalog resource policy for access control.

Puts the metadata key value pair for a specified schema version ID.

Puts the specified workflow run properties for the given workflow run.

Queries for the schema version metadata information.

Adds a new version to the existing schema.

Removes a key value pair from the schema version metadata for the specified schema version ID.

Resets a bookmark entry.

Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.

Executes the statement.

Searches a set of tables based on properties in the table metadata as well as on the parent database.

Starts a new run of the specified blueprint.
Starts a column statistics task run, for a specified table and columns.

Starts a crawl using the specified crawler, regardless of what is scheduled.

Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.

Starts a recommendation run that is used to generate rules when you don't know what rules to write.

Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table).

Begins an asynchronous task to export all labeled data for a particular transform.

Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.

Starts a job run using a job definition.

Starts a task to estimate the quality of the transform.

Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.

Starts an existing trigger.

Starts a new run of the specified workflow.
Stops a task run for the specified table.
If the specified crawler is running, stops the crawl.
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
Stops the session.
Stops a specified trigger.
Stops the execution of the specified workflow run.

Adds tags to a resource.

Removes tags from a resource.
Updates a registered blueprint.
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).

Creates or updates partition statistics of columns.

Creates or updates table statistics of columns.

Updates a connection definition in the Data Catalog.

Updates a crawler.

Updates the schedule of a crawler using a cron expression.
Updates the specified data quality ruleset.
Updates an existing database definition in a Data Catalog.
Updates a specified development endpoint.

Updates an existing job definition.

Synchronizes a job from the source control repository.

Updates an existing machine learning transform.

Updates a partition.

Updates an existing registry which is used to hold a collection of schemas.

Updates the description, compatibility setting, or version checkpoint for a schema set.

Synchronizes a job to the source control repository.

Updates a metadata table in the Data Catalog.
Updates the configuration for an existing table optimizer.
Updates a trigger definition.
Updates an existing function definition in the Data Catalog.
Updates an existing workflow.

Functions

Link to this function

batch_create_partition(Client, Input)

View Source
Creates one or more partitions in a batch operation.
Link to this function

batch_create_partition(Client, Input, Options)

View Source
Link to this function

batch_delete_connection(Client, Input)

View Source
Deletes a list of connection definitions from the Data Catalog.
Link to this function

batch_delete_connection(Client, Input, Options)

View Source
Link to this function

batch_delete_partition(Client, Input)

View Source
Deletes one or more partitions in a batch operation.
Link to this function

batch_delete_partition(Client, Input, Options)

View Source
Link to this function

batch_delete_table(Client, Input)

View Source

Deletes multiple tables at once.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
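As a hedged illustration only (placeholder names; request keys follow the BatchDeleteTable syntax in the Glue API reference), assuming a Client as in the sketch at the top of this page:

    {ok, Result, _} = aws_glue:batch_delete_table(Client, #{
        <<"DatabaseName">>   => <<"sales_db">>,
        <<"TablesToDelete">> => [<<"staging_orders">>, <<"staging_customers">>]
    }),
    % Tables that could not be deleted come back in the Errors list.
    Errors = maps:get(<<"Errors">>, Result, []).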
Link to this function

batch_delete_table(Client, Input, Options)

View Source
Link to this function

batch_delete_table_version(Client, Input)

View Source
Deletes a specified batch of versions of a table.
Link to this function

batch_delete_table_version(Client, Input, Options)

View Source
Link to this function

batch_get_blueprints(Client, Input)

View Source
Retrieves information about a list of blueprints.
Link to this function

batch_get_blueprints(Client, Input, Options)

View Source
Link to this function

batch_get_crawlers(Client, Input)

View Source

Returns a list of resource metadata for a given list of crawler names.

After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
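A minimal sketch of that two-step pattern (placeholder values; CrawlerNames and Crawlers follow the ListCrawlers/BatchGetCrawlers shapes), assuming a Client as in the example at the top of this page:

    % First list the crawler names you are allowed to see ...
    {ok, ListResult, _} = aws_glue:list_crawlers(Client, #{<<"MaxResults">> => 25}),
    Names = maps:get(<<"CrawlerNames">>, ListResult, []),
    % ... then fetch their metadata in one batch call.
    {ok, BatchResult, _} = aws_glue:batch_get_crawlers(Client, #{<<"CrawlerNames">> => Names}),
    Crawlers = maps:get(<<"Crawlers">>, BatchResult, []).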
Link to this function

batch_get_crawlers(Client, Input, Options)

View Source
Link to this function

batch_get_custom_entity_types(Client, Input)

View Source
Retrieves the details for the custom patterns specified by a list of names.
Link to this function

batch_get_custom_entity_types(Client, Input, Options)

View Source
Link to this function

batch_get_data_quality_result(Client, Input)

View Source
Retrieves a list of data quality results for the specified result IDs.
Link to this function

batch_get_data_quality_result(Client, Input, Options)

View Source
Link to this function

batch_get_dev_endpoints(Client, Input)

View Source

Returns a list of resource metadata for a given list of development endpoint names.

After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Link to this function

batch_get_dev_endpoints(Client, Input, Options)

View Source
Link to this function

batch_get_jobs(Client, Input)

View Source

Returns a list of resource metadata for a given list of job names.

After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Link to this function

batch_get_jobs(Client, Input, Options)

View Source
Link to this function

batch_get_partition(Client, Input)

View Source
Retrieves partitions in a batch request.
Link to this function

batch_get_partition(Client, Input, Options)

View Source
Link to this function

batch_get_table_optimizer(Client, Input)

View Source
Returns the configuration for the specified table optimizers.
Link to this function

batch_get_table_optimizer(Client, Input, Options)

View Source
Link to this function

batch_get_triggers(Client, Input)

View Source

Returns a list of resource metadata for a given list of trigger names.

After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Link to this function

batch_get_triggers(Client, Input, Options)

View Source
Link to this function

batch_get_workflows(Client, Input)

View Source

Returns a list of resource metadata for a given list of workflow names.

After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Link to this function

batch_get_workflows(Client, Input, Options)

View Source
Link to this function

batch_stop_job_run(Client, Input)

View Source
Stops one or more job runs for a specified job definition.
Link to this function

batch_stop_job_run(Client, Input, Options)

View Source
Link to this function

batch_update_partition(Client, Input)

View Source
Updates one or more partitions in a batch operation.
Link to this function

batch_update_partition(Client, Input, Options)

View Source
Link to this function

cancel_data_quality_rule_recommendation_run(Client, Input)

View Source
Cancels the specified recommendation run that was being used to generate rules.
Link to this function

cancel_data_quality_rule_recommendation_run(Client, Input, Options)

View Source
Link to this function

cancel_data_quality_ruleset_evaluation_run(Client, Input)

View Source
Cancels a run where a ruleset is being evaluated against a data source.
Link to this function

cancel_data_quality_ruleset_evaluation_run(Client, Input, Options)

View Source
Link to this function

cancel_ml_task_run(Client, Input)

View Source

Cancels (stops) a task run.

Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with a task run's parent transform's TransformID and the task run's TaskRunId.
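A hedged sketch of the call (both identifiers are placeholders; key names follow the CancelMLTaskRun request syntax), assuming a Client as in the example at the top of this page:

    {ok, Result, _} = aws_glue:cancel_ml_task_run(Client, #{
        <<"TransformId">> => <<"tfm-example-transform-id">>,   % the parent transform
        <<"TaskRunId">>   => <<"task-run-example-id">>         % the run to cancel
    }),
    Status = maps:get(<<"Status">>, Result).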
Link to this function

cancel_ml_task_run(Client, Input, Options)

View Source
Link to this function

cancel_statement(Client, Input)

View Source
Cancels the statement.
Link to this function

cancel_statement(Client, Input, Options)

View Source
Link to this function

check_schema_version_validity(Client, Input)

View Source

Validates the supplied schema.

This call has no side effects; it simply validates the supplied schema, using DataFormat as the format. Since it does not take a schema set name, no compatibility checks are performed.
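For example, validating a small Avro schema (a sketch; DataFormat and SchemaDefinition follow the CheckSchemaVersionValidity request syntax), assuming a Client as in the example at the top of this page:

    AvroSchema = <<"{\"type\": \"record\", \"name\": \"Person\", \"fields\": [{\"name\": \"id\", \"type\": \"long\"}]}">>,
    {ok, Result, _} = aws_glue:check_schema_version_validity(Client, #{
        <<"DataFormat">>       => <<"AVRO">>,
        <<"SchemaDefinition">> => AvroSchema
    }),
    Valid = maps:get(<<"Valid">>, Result).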
Link to this function

check_schema_version_validity(Client, Input, Options)

View Source
Link to this function

create_blueprint(Client, Input)

View Source
Registers a blueprint with Glue.
Link to this function

create_blueprint(Client, Input, Options)

View Source
Link to this function

create_classifier(Client, Input)

View Source

Creates a classifier in the user's account.

This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.
Link to this function

create_classifier(Client, Input, Options)

View Source
Link to this function

create_connection(Client, Input)

View Source

Creates a connection definition in the Data Catalog.

Connections used for creating federated resources require the IAM glue:PassConnection permission.
Link to this function

create_connection(Client, Input, Options)

View Source
Link to this function

create_crawler(Client, Input)

View Source

Creates a new crawler with specified targets, role, configuration, and optional schedule.

At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
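A minimal sketch with an S3 target (names, role ARN, path, and schedule are placeholders; field names follow the CreateCrawler request syntax), assuming a Client as in the example at the top of this page:

    {ok, _, _} = aws_glue:create_crawler(Client, #{
        <<"Name">>         => <<"orders-crawler">>,
        <<"Role">>         => <<"arn:aws:iam::123456789012:role/GlueCrawlerRole">>,
        <<"DatabaseName">> => <<"sales_db">>,
        <<"Targets">>      => #{<<"S3Targets">> => [#{<<"Path">> => <<"s3://example-bucket/orders/">>}]},
        <<"Schedule">>     => <<"cron(0 2 * * ? *)">>   % optional: run daily at 02:00 UTC
    }).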
Link to this function

create_crawler(Client, Input, Options)

View Source
Link to this function

create_custom_entity_type(Client, Input)

View Source

Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data.

Each custom pattern you create specifies a regular expression and an optional list of context words. If no context words are passed, only a regular expression is checked.
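A hedged sketch (the pattern name, regex, and context words are placeholders; keys follow the CreateCustomEntityType request syntax), assuming a Client as in the example at the top of this page:

    {ok, Result, _} = aws_glue:create_custom_entity_type(Client, #{
        <<"Name">>         => <<"EMPLOYEE_BADGE_ID">>,
        <<"RegexString">>  => <<"EMP-[0-9]{6}">>,
        <<"ContextWords">> => [<<"badge">>, <<"employee">>]   % optional
    }),
    CreatedName = maps:get(<<"Name">>, Result).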
Link to this function

create_custom_entity_type(Client, Input, Options)

View Source
Link to this function

create_data_quality_ruleset(Client, Input)

View Source

Creates a data quality ruleset with DQDL rules applied to a specified Glue table.

You create the ruleset using the Data Quality Definition Language (DQDL). For more information, see the Glue developer guide.
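A sketch with a two-rule DQDL document (names are placeholders; Ruleset and TargetTable follow the CreateDataQualityRuleset request syntax), assuming a Client as in the example at the top of this page:

    Dqdl = <<"Rules = [ RowCount > 0, IsComplete \"order_id\" ]">>,
    {ok, Result, _} = aws_glue:create_data_quality_ruleset(Client, #{
        <<"Name">>        => <<"orders-basic-checks">>,
        <<"Ruleset">>     => Dqdl,
        <<"TargetTable">> => #{<<"DatabaseName">> => <<"sales_db">>, <<"TableName">> => <<"orders">>}
    }),
    RulesetName = maps:get(<<"Name">>, Result).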
Link to this function

create_data_quality_ruleset(Client, Input, Options)

View Source
Link to this function

create_database(Client, Input)

View Source
Creates a new database in a Data Catalog.
Link to this function

create_database(Client, Input, Options)

View Source
Link to this function

create_dev_endpoint(Client, Input)

View Source
Creates a new development endpoint.
Link to this function

create_dev_endpoint(Client, Input, Options)

View Source
Link to this function

create_job(Client, Input)

View Source
Creates a new job definition.
Link to this function

create_job(Client, Input, Options)

View Source
Link to this function

create_ml_transform(Client, Input)

View Source

Creates a Glue machine learning transform.

This operation creates the transform and all the necessary parameters to train it.

Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.

You must also specify certain parameters for the tasks that Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html.
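A hedged sketch of a FindMatches transform (table, role, and key column are placeholders; Parameters follows the TransformParameters structure in the API reference), assuming a Client as in the example at the top of this page:

    {ok, Result, _} = aws_glue:create_ml_transform(Client, #{
        <<"Name">> => <<"dedupe-customers">>,
        <<"Role">> => <<"arn:aws:iam::123456789012:role/GlueMLRole">>,
        <<"InputRecordTables">> => [
            #{<<"DatabaseName">> => <<"crm">>, <<"TableName">> => <<"customers">>}
        ],
        <<"Parameters">> => #{
            <<"TransformType">> => <<"FIND_MATCHES">>,
            <<"FindMatchesParameters">> => #{<<"PrimaryKeyColumnName">> => <<"customer_id">>}
        }
    }),
    TransformId = maps:get(<<"TransformId">>, Result).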
Link to this function

create_ml_transform(Client, Input, Options)

View Source
Link to this function

create_partition(Client, Input)

View Source
Creates a new partition.
Link to this function

create_partition(Client, Input, Options)

View Source
Link to this function

create_partition_index(Client, Input)

View Source
Creates a specified partition index in an existing table.
Link to this function

create_partition_index(Client, Input, Options)

View Source
Link to this function

create_registry(Client, Input)

View Source
Creates a new registry which may be used to hold a collection of schemas.
Link to this function

create_registry(Client, Input, Options)

View Source
Link to this function

create_schema(Client, Input)

View Source

Creates a new schema set and registers the schema definition.

If the schema set already exists, an error is returned and the version is not registered.

When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used.

When this API is called without a RegistryId, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.
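A minimal sketch registering a first Avro version (registry and schema names are placeholders; omit RegistryId to fall back to default-registry), assuming a Client as in the example at the top of this page:

    AvroSchema = <<"{\"type\": \"record\", \"name\": \"Person\", \"fields\": [{\"name\": \"id\", \"type\": \"long\"}]}">>,
    {ok, Result, _} = aws_glue:create_schema(Client, #{
        <<"RegistryId">>       => #{<<"RegistryName">> => <<"my-registry">>},   % optional
        <<"SchemaName">>       => <<"person">>,
        <<"DataFormat">>       => <<"AVRO">>,
        <<"Compatibility">>    => <<"BACKWARD">>,
        <<"SchemaDefinition">> => AvroSchema
    }),
    SchemaArn = maps:get(<<"SchemaArn">>, Result).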
Link to this function

create_schema(Client, Input, Options)

View Source
Link to this function

create_script(Client, Input)

View Source
Transforms a directed acyclic graph (DAG) into code.
Link to this function

create_script(Client, Input, Options)

View Source
Link to this function

create_security_configuration(Client, Input)

View Source

Creates a new security configuration.

A security configuration is a set of security properties that can be used by Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints: https://docs.aws.amazon.com/glue/latest/dg/encryption-security-configuration.html.
Link to this function

create_security_configuration(Client, Input, Options)

View Source
Link to this function

create_session(Client, Input)

View Source
Creates a new session.
Link to this function

create_session(Client, Input, Options)

View Source
Link to this function

create_table(Client, Input)

View Source
Creates a new table definition in the Data Catalog.
Link to this function

create_table(Client, Input, Options)

View Source
Link to this function

create_table_optimizer(Client, Input)

View Source

Creates a new table optimizer for a specific function.

compaction is the only currently supported optimizer type.
Link to this function

create_table_optimizer(Client, Input, Options)

View Source
Link to this function

create_trigger(Client, Input)

View Source
Creates a new trigger.
Link to this function

create_trigger(Client, Input, Options)

View Source
Link to this function

create_user_defined_function(Client, Input)

View Source
Creates a new function definition in the Data Catalog.
Link to this function

create_user_defined_function(Client, Input, Options)

View Source
Link to this function

create_workflow(Client, Input)

View Source
Creates a new workflow.
Link to this function

create_workflow(Client, Input, Options)

View Source
Link to this function

delete_blueprint(Client, Input)

View Source
Deletes an existing blueprint.
Link to this function

delete_blueprint(Client, Input, Options)

View Source
Link to this function

delete_classifier(Client, Input)

View Source
Removes a classifier from the Data Catalog.
Link to this function

delete_classifier(Client, Input, Options)

View Source
Link to this function

delete_column_statistics_for_partition(Client, Input)

View Source

Delete the partition column statistics of a column.

The Identity and Access Management (IAM) permission required for this operation is DeletePartition.
Link to this function

delete_column_statistics_for_partition(Client, Input, Options)

View Source
Link to this function

delete_column_statistics_for_table(Client, Input)

View Source

Deletes table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is DeleteTable.
Link to this function

delete_column_statistics_for_table(Client, Input, Options)

View Source
Link to this function

delete_connection(Client, Input)

View Source
Deletes a connection from the Data Catalog.
Link to this function

delete_connection(Client, Input, Options)

View Source
Link to this function

delete_crawler(Client, Input)

View Source
Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.
Link to this function

delete_crawler(Client, Input, Options)

View Source
Link to this function

delete_custom_entity_type(Client, Input)

View Source
Deletes a custom pattern by specifying its name.
Link to this function

delete_custom_entity_type(Client, Input, Options)

View Source
Link to this function

delete_data_quality_ruleset(Client, Input)

View Source
Deletes a data quality ruleset.
Link to this function

delete_data_quality_ruleset(Client, Input, Options)

View Source
Link to this function

delete_database(Client, Input)

View Source

Removes a specified database from a Data Catalog.

After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.
Link to this function

delete_database(Client, Input, Options)

View Source
Link to this function

delete_dev_endpoint(Client, Input)

View Source
Deletes a specified development endpoint.
Link to this function

delete_dev_endpoint(Client, Input, Options)

View Source
Link to this function

delete_job(Client, Input)

View Source

Deletes a specified job definition.

If the job definition is not found, no exception is thrown.
Link to this function

delete_job(Client, Input, Options)

View Source
Link to this function

delete_ml_transform(Client, Input)

View Source

Deletes a Glue machine learning transform.

Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any Glue jobs that still reference the deleted transform will no longer succeed.
Link to this function

delete_ml_transform(Client, Input, Options)

View Source
Link to this function

delete_partition(Client, Input)

View Source
Deletes a specified partition.
Link to this function

delete_partition(Client, Input, Options)

View Source
Link to this function

delete_partition_index(Client, Input)

View Source
Deletes a specified partition index from an existing table.
Link to this function

delete_partition_index(Client, Input, Options)

View Source
Link to this function

delete_registry(Client, Input)

View Source

Delete the entire registry including schema and all of its versions.

To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry will deactivate all online operations for the registry such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.
Link to this function

delete_registry(Client, Input, Options)

View Source
Link to this function

delete_resource_policy(Client, Input)

View Source
Deletes a specified policy.
Link to this function

delete_resource_policy(Client, Input, Options)

View Source
Link to this function

delete_schema(Client, Input)

View Source

Deletes the entire schema set, including the schema set and all of its versions.

To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema will deactivate all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.
Link to this function

delete_schema(Client, Input, Options)

View Source
Link to this function

delete_schema_versions(Client, Input)

View Source

Remove versions from the specified schema.

A version number or range may be supplied. If the compatibility mode forbids deleting a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions.

When the range of version numbers contains a checkpointed version, the API will return a 409 conflict and will not proceed with the deletion. You have to remove the checkpoint first, using the DeleteSchemaCheckpoint API, before using this API.

You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set. The first schema version can only be deleted by the DeleteSchema API. This operation will also delete the attached SchemaVersionMetadata under the schema versions. Hard deletes will be enforced on the database.
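A hedged sketch deleting an inclusive version range (registry and schema names are placeholders; Versions follows the DeleteSchemaVersions request syntax), assuming a Client as in the example at the top of this page:

    {ok, Result, _} = aws_glue:delete_schema_versions(Client, #{
        <<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>, <<"SchemaName">> => <<"person">>},
        <<"Versions">> => <<"2-4">>   % a single number such as <<"3">> also works
    }),
    % Per-version failures, if any, are reported here.
    VersionErrors = maps:get(<<"SchemaVersionErrors">>, Result, []).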
Link to this function

delete_schema_versions(Client, Input, Options)

View Source
Link to this function

delete_security_configuration(Client, Input)

View Source
Deletes a specified security configuration.
Link to this function

delete_security_configuration(Client, Input, Options)

View Source
Link to this function

delete_session(Client, Input)

View Source
Deletes the session.
Link to this function

delete_session(Client, Input, Options)

View Source
Link to this function

delete_table(Client, Input)

View Source

Removes a table definition from the Data Catalog.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
Link to this function

delete_table(Client, Input, Options)

View Source
Link to this function

delete_table_optimizer(Client, Input)

View Source

Deletes an optimizer and all associated metadata for a table.

The optimization will no longer be performed on the table.
Link to this function

delete_table_optimizer(Client, Input, Options)

View Source
Link to this function

delete_table_version(Client, Input)

View Source
Deletes a specified version of a table.
Link to this function

delete_table_version(Client, Input, Options)

View Source
Link to this function

delete_trigger(Client, Input)

View Source

Deletes a specified trigger.

If the trigger is not found, no exception is thrown.
Link to this function

delete_trigger(Client, Input, Options)

View Source
Link to this function

delete_user_defined_function(Client, Input)

View Source
Deletes an existing function definition from the Data Catalog.
Link to this function

delete_user_defined_function(Client, Input, Options)

View Source
Link to this function

delete_workflow(Client, Input)

View Source
Deletes a workflow.
Link to this function

delete_workflow(Client, Input, Options)

View Source
Link to this function

get_blueprint(Client, Input)

View Source
Retrieves the details of a blueprint.
Link to this function

get_blueprint(Client, Input, Options)

View Source
Link to this function

get_blueprint_run(Client, Input)

View Source
Retrieves the details of a blueprint run.
Link to this function

get_blueprint_run(Client, Input, Options)

View Source
Link to this function

get_blueprint_runs(Client, Input)

View Source
Retrieves the details of blueprint runs for a specified blueprint.
Link to this function

get_blueprint_runs(Client, Input, Options)

View Source
Link to this function

get_catalog_import_status(Client, Input)

View Source
Retrieves the status of a migration operation.
Link to this function

get_catalog_import_status(Client, Input, Options)

View Source
Link to this function

get_classifier(Client, Input)

View Source
Retrieve a classifier by name.
Link to this function

get_classifier(Client, Input, Options)

View Source
Link to this function

get_classifiers(Client, Input)

View Source
Lists all classifier objects in the Data Catalog.
Link to this function

get_classifiers(Client, Input, Options)

View Source
Link to this function

get_column_statistics_for_partition(Client, Input)

View Source

Retrieves partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetPartition.
Link to this function

get_column_statistics_for_partition(Client, Input, Options)

View Source
Link to this function

get_column_statistics_for_table(Client, Input)

View Source

Retrieves table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetTable.
Link to this function

get_column_statistics_for_table(Client, Input, Options)

View Source
Link to this function

get_column_statistics_task_run(Client, Input)

View Source
Get the associated metadata/information for a task run, given a task run ID.
Link to this function

get_column_statistics_task_run(Client, Input, Options)

View Source
Link to this function

get_column_statistics_task_runs(Client, Input)

View Source
Retrieves information about all runs associated with the specified table.
Link to this function

get_column_statistics_task_runs(Client, Input, Options)

View Source
Link to this function

get_connection(Client, Input)

View Source
Retrieves a connection definition from the Data Catalog.
Link to this function

get_connection(Client, Input, Options)

View Source
Link to this function

get_connections(Client, Input)

View Source
Retrieves a list of connection definitions from the Data Catalog.
Link to this function

get_connections(Client, Input, Options)

View Source
Link to this function

get_crawler(Client, Input)

View Source
Retrieves metadata for a specified crawler.
Link to this function

get_crawler(Client, Input, Options)

View Source
Link to this function

get_crawler_metrics(Client, Input)

View Source
Retrieves metrics about specified crawlers.
Link to this function

get_crawler_metrics(Client, Input, Options)

View Source
Link to this function

get_crawlers(Client, Input)

View Source
Retrieves metadata for all crawlers defined in the customer account.
Link to this function

get_crawlers(Client, Input, Options)

View Source
Link to this function

get_custom_entity_type(Client, Input)

View Source
Retrieves the details of a custom pattern by specifying its name.
Link to this function

get_custom_entity_type(Client, Input, Options)

View Source
Link to this function

get_data_catalog_encryption_settings(Client, Input)

View Source
Retrieves the security configuration for a specified catalog.
Link to this function

get_data_catalog_encryption_settings(Client, Input, Options)

View Source
Link to this function

get_data_quality_result(Client, Input)

View Source
Retrieves the result of a data quality rule evaluation.
Link to this function

get_data_quality_result(Client, Input, Options)

View Source
Link to this function

get_data_quality_rule_recommendation_run(Client, Input)

View Source
Gets the specified recommendation run that was used to generate rules.
Link to this function

get_data_quality_rule_recommendation_run(Client, Input, Options)

View Source
Link to this function

get_data_quality_ruleset(Client, Input)

View Source
Returns an existing ruleset by identifier or name.
Link to this function

get_data_quality_ruleset(Client, Input, Options)

View Source
Link to this function

get_data_quality_ruleset_evaluation_run(Client, Input)

View Source
Retrieves a specific run where a ruleset is evaluated against a data source.
Link to this function

get_data_quality_ruleset_evaluation_run(Client, Input, Options)

View Source
Link to this function

get_database(Client, Input)

View Source
Retrieves the definition of a specified database.
Link to this function

get_database(Client, Input, Options)

View Source
Link to this function

get_databases(Client, Input)

View Source
Retrieves all databases defined in a given Data Catalog.
Link to this function

get_databases(Client, Input, Options)

View Source
Link to this function

get_dataflow_graph(Client, Input)

View Source
Transforms a Python script into a directed acyclic graph (DAG).
Link to this function

get_dataflow_graph(Client, Input, Options)

View Source
Link to this function

get_dev_endpoint(Client, Input)

View Source

Retrieves information about a specified development endpoint.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.
Link to this function

get_dev_endpoint(Client, Input, Options)

View Source
Link to this function

get_dev_endpoints(Client, Input)

View Source

Retrieves all the development endpoints in this Amazon Web Services account.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.
Link to this function

get_dev_endpoints(Client, Input, Options)

View Source
Link to this function

get_job(Client, Input)

View Source
Retrieves an existing job definition.
Link to this function

get_job(Client, Input, Options)

View Source
Link to this function

get_job_bookmark(Client, Input)

View Source

Returns information on a job bookmark entry.

For more information about enabling and using job bookmarks, see:

  • Tracking processed data using job bookmarks: https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html

  • Job parameters used by Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html

  • Job structure: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html#aws-glue-api-jobs-job-Job

Link to this function

get_job_bookmark(Client, Input, Options)

View Source
Link to this function

get_job_run(Client, Input)

View Source
Retrieves the metadata for a given job run.
Link to this function

get_job_run(Client, Input, Options)

View Source
Link to this function

get_job_runs(Client, Input)

View Source
Retrieves metadata for all runs of a given job definition.
Link to this function

get_job_runs(Client, Input, Options)

View Source
Link to this function

get_jobs(Client, Input)

View Source
Retrieves all current job definitions.
Link to this function

get_jobs(Client, Input, Options)

View Source
Link to this function

get_mapping(Client, Input)

View Source
Creates mappings.
Link to this function

get_mapping(Client, Input, Options)

View Source
Link to this function

get_ml_task_run(Client, Input)

View Source

Gets details for a specific task run on a machine learning transform.

Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can check the status of any task run by calling GetMLTaskRun with the TaskRunID and its parent transform's TransformID.
Link to this function

get_ml_task_run(Client, Input, Options)

View Source
Link to this function

get_ml_task_runs(Client, Input)

View Source

Gets a list of runs for a machine learning transform.

Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can get a sortable, filterable list of machine learning task runs by calling GetMLTaskRuns with their parent transform's TransformID and other optional parameters as documented in this section.

This operation returns a list of historic runs and must be paginated.
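A hedged pagination sketch for use inside a module (the helper name is illustrative; TaskRuns and NextToken follow the GetMLTaskRuns response syntax):

    %% Collect every task run for a transform by following NextToken.
    fetch_all_ml_task_runs(Client, TransformId) ->
        fetch_all_ml_task_runs(Client, TransformId, undefined, []).

    fetch_all_ml_task_runs(Client, TransformId, Token, Acc) ->
        Base  = #{<<"TransformId">> => TransformId, <<"MaxResults">> => 100},
        Input = case Token of undefined -> Base; _ -> Base#{<<"NextToken">> => Token} end,
        {ok, Result, _} = aws_glue:get_ml_task_runs(Client, Input),
        Acc1 = Acc ++ maps:get(<<"TaskRuns">>, Result, []),
        case maps:get(<<"NextToken">>, Result, undefined) of
            undefined -> Acc1;
            Next      -> fetch_all_ml_task_runs(Client, TransformId, Next, Acc1)
        end.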
Link to this function

get_ml_task_runs(Client, Input, Options)

View Source
Link to this function

get_ml_transform(Client, Input)

View Source

Gets a Glue machine learning transform artifact and all its corresponding metadata.

Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. You can retrieve their metadata by calling GetMLTransform.
Link to this function

get_ml_transform(Client, Input, Options)

View Source
Link to this function

get_ml_transforms(Client, Input)

View Source

Gets a sortable, filterable list of existing Glue machine learning transforms.

Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue, and you can retrieve their metadata by calling GetMLTransforms.
Link to this function

get_ml_transforms(Client, Input, Options)

View Source
Link to this function

get_partition(Client, Input)

View Source
Retrieves information about a specified partition.
Link to this function

get_partition(Client, Input, Options)

View Source
Link to this function

get_partition_indexes(Client, Input)

View Source
Retrieves the partition indexes associated with a table.
Link to this function

get_partition_indexes(Client, Input, Options)

View Source
Link to this function

get_partitions(Client, Input)

View Source
Retrieves information about the partitions in a table.
Link to this function

get_partitions(Client, Input, Options)

View Source
Link to this function

get_plan(Client, Input)

View Source
Gets code to perform a specified mapping.
Link to this function

get_plan(Client, Input, Options)

View Source
Link to this function

get_registry(Client, Input)

View Source
Describes the specified registry in detail.
Link to this function

get_registry(Client, Input, Options)

View Source
Link to this function

get_resource_policies(Client, Input)

View Source

Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants.

Also retrieves the Data Catalog resource policy.

If you enabled metadata encryption in Data Catalog settings, and you do not have permission on the KMS key, the operation can't return the Data Catalog resource policy.
Link to this function

get_resource_policies(Client, Input, Options)

View Source
Link to this function

get_resource_policy(Client, Input)

View Source
Retrieves a specified resource policy.
Link to this function

get_resource_policy(Client, Input, Options)

View Source
Link to this function

get_schema(Client, Input)

View Source
Describes the specified schema in detail.
Link to this function

get_schema(Client, Input, Options)

View Source
Link to this function

get_schema_by_definition(Client, Input)

View Source

Retrieves a schema by the SchemaDefinition.

The schema definition is sent to the Schema Registry, canonicalized, and hashed. If the hash is matched within the scope of the SchemaName or ARN (or the default registry, if none is supplied), that schema’s metadata is returned. Otherwise, a 404 or NotFound error is returned. Schema versions in Deleted statuses will not be included in the results.
Link to this function

get_schema_by_definition(Client, Input, Options)

View Source
Link to this function

get_schema_version(Client, Input)

View Source

Get the specified schema by its unique ID assigned when a version of the schema is created or registered.

Schema versions in Deleted status will not be included in the results.
Link to this function

get_schema_version(Client, Input, Options)

View Source
Link to this function

get_schema_versions_diff(Client, Input)

View Source

Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.

This API allows you to compare two schema versions between two schema definitions under the same schema.
Link to this function

get_schema_versions_diff(Client, Input, Options)

View Source
Link to this function

get_security_configuration(Client, Input)

View Source
Retrieves a specified security configuration.
Link to this function

get_security_configuration(Client, Input, Options)

View Source
Link to this function

get_security_configurations(Client, Input)

View Source
Retrieves a list of all security configurations.
Link to this function

get_security_configurations(Client, Input, Options)

View Source
Link to this function

get_session(Client, Input)

View Source
Retrieves the session.
Link to this function

get_session(Client, Input, Options)

View Source
Link to this function

get_statement(Client, Input)

View Source
Retrieves the statement.
Link to this function

get_statement(Client, Input, Options)

View Source
Link to this function

get_table(Client, Input)

View Source
Retrieves the Table definition in a Data Catalog for a specified table.
Link to this function

get_table(Client, Input, Options)

View Source
Link to this function

get_table_optimizer(Client, Input)

View Source
Returns the configuration of all optimizers associated with a specified table.
Link to this function

get_table_optimizer(Client, Input, Options)

View Source
Link to this function

get_table_version(Client, Input)

View Source
Retrieves a specified version of a table.
Link to this function

get_table_version(Client, Input, Options)

View Source
Link to this function

get_table_versions(Client, Input)

View Source
Retrieves a list of strings that identify available versions of a specified table.
Link to this function

get_table_versions(Client, Input, Options)

View Source
Link to this function

get_tables(Client, Input)

View Source
Retrieves the definitions of some or all of the tables in a given Database.
Link to this function

get_tables(Client, Input, Options)

View Source
Link to this function

get_tags(Client, Input)

View Source
Retrieves a list of tags associated with a resource.
Link to this function

get_tags(Client, Input, Options)

View Source
Link to this function

get_trigger(Client, Input)

View Source
Retrieves the definition of a trigger.
Link to this function

get_trigger(Client, Input, Options)

View Source
Link to this function

get_triggers(Client, Input)

View Source
Gets all the triggers associated with a job.
Link to this function

get_triggers(Client, Input, Options)

View Source
Link to this function

get_unfiltered_partition_metadata(Client, Input)

View Source

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

For IAM authorization, the public IAM action associated with this API is glue:GetPartition.
Link to this function

get_unfiltered_partition_metadata(Client, Input, Options)

View Source
Link to this function

get_unfiltered_partitions_metadata(Client, Input)

View Source

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

For IAM authorization, the public IAM action associated with this API is glue:GetPartitions.
Link to this function

get_unfiltered_partitions_metadata(Client, Input, Options)

View Source
Link to this function

get_unfiltered_table_metadata(Client, Input)

View Source

Retrieves table metadata from the Data Catalog that contains unfiltered metadata.

For IAM authorization, the public IAM action associated with this API is glue:GetTable.
Link to this function

get_unfiltered_table_metadata(Client, Input, Options)

View Source
Link to this function

get_user_defined_function(Client, Input)

View Source
Retrieves a specified function definition from the Data Catalog.
Link to this function

get_user_defined_function(Client, Input, Options)

View Source
Link to this function

get_user_defined_functions(Client, Input)

View Source
Retrieves multiple function definitions from the Data Catalog.
Link to this function

get_user_defined_functions(Client, Input, Options)

View Source
Link to this function

get_workflow(Client, Input)

View Source
Retrieves resource metadata for a workflow.
Link to this function

get_workflow(Client, Input, Options)

View Source
Link to this function

get_workflow_run(Client, Input)

View Source
Retrieves the metadata for a given workflow run.
Link to this function

get_workflow_run(Client, Input, Options)

View Source
Link to this function

get_workflow_run_properties(Client, Input)

View Source
Retrieves the workflow run properties which were set during the run.
Link to this function

get_workflow_run_properties(Client, Input, Options)

View Source
Link to this function

get_workflow_runs(Client, Input)

View Source
Retrieves metadata for all runs of a given workflow.
Link to this function

get_workflow_runs(Client, Input, Options)

View Source
Link to this function

import_catalog_to_glue(Client, Input)

View Source
Imports an existing Amazon Athena Data Catalog to Glue.
Link to this function

import_catalog_to_glue(Client, Input, Options)

View Source
Link to this function

list_blueprints(Client, Input)

View Source
Lists all the blueprint names in an account.
Link to this function

list_blueprints(Client, Input, Options)

View Source
Link to this function

list_column_statistics_task_runs(Client, Input)

View Source
List all task runs for a particular account.
Link to this function

list_column_statistics_task_runs(Client, Input, Options)

View Source
Link to this function

list_crawlers(Client, Input)

View Source

Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag.

This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
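A minimal sketch filtering by a tag (the tag key and value are placeholders), assuming a Client as in the example at the top of this page:

    {ok, Result, _} = aws_glue:list_crawlers(Client, #{
        <<"MaxResults">> => 50,
        <<"Tags">>       => #{<<"team">> => <<"data-platform">>}   % optional tag filter
    }),
    Names     = maps:get(<<"CrawlerNames">>, Result, []),
    NextToken = maps:get(<<"NextToken">>, Result, undefined).   % pass back to fetch further pages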
Link to this function

list_crawlers(Client, Input, Options)

View Source
Link to this function

list_crawls(Client, Input)

View Source

Returns all the crawls of a specified crawler.

Returns only the crawls that have occurred since the launch date of the crawler history feature, and only retains up to 12 months of crawls. Older crawls will not be returned.

You may use this API to:

  • Retrieve all the crawls of a specified crawler.

  • Retrieve all the crawls of a specified crawler within a limited count.

  • Retrieve all the crawls of a specified crawler in a specific time range.

  • Retrieve all the crawls of a specified crawler with a particular state, crawl ID, or DPU hour value.

Link to this function

list_crawls(Client, Input, Options)

View Source
Link to this function

list_custom_entity_types(Client, Input)

View Source
Lists all the custom patterns that have been created.
Link to this function

list_custom_entity_types(Client, Input, Options)

View Source
Link to this function

list_data_quality_results(Client, Input)

View Source
Returns all data quality execution results for your account.
Link to this function

list_data_quality_results(Client, Input, Options)

View Source
Link to this function

list_data_quality_rule_recommendation_runs(Client, Input)

View Source
Lists the recommendation runs meeting the filter criteria.
Link to this function

list_data_quality_rule_recommendation_runs(Client, Input, Options)

View Source
Link to this function

list_data_quality_ruleset_evaluation_runs(Client, Input)

View Source
Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source.
Link to this function

list_data_quality_ruleset_evaluation_runs(Client, Input, Options)

View Source
Link to this function

list_data_quality_rulesets(Client, Input)

View Source
Returns a paginated list of rulesets for the specified list of Glue tables.
Link to this function

list_data_quality_rulesets(Client, Input, Options)

View Source
Link to this function

list_dev_endpoints(Client, Input)

View Source

Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag.

This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Link to this function

list_dev_endpoints(Client, Input, Options)

View Source
Link to this function

list_jobs(Client, Input)

View Source

Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag.

This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Link to this function

list_jobs(Client, Input, Options)

View Source
Link to this function

list_ml_transforms(Client, Input)

View Source

Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag.

This operation takes the optional Tags field, which you can use as a filter of the responses so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tags are retrieved.
Link to this function

list_ml_transforms(Client, Input, Options)

View Source
Link to this function

list_registries(Client, Input)

View Source

Returns a list of registries that you have created, with minimal registry information.

Registries in the Deleting status will not be included in the results. Empty results will be returned if there are no registries available.
Link to this function

list_registries(Client, Input, Options)

View Source
Link to this function

list_schema_versions(Client, Input)

View Source

Returns a list of schema versions that you have created, with minimal information.

Schema versions in Deleted status will not be included in the results. Empty results will be returned if there are no schema versions available.
Link to this function

list_schema_versions(Client, Input, Options)

View Source
Link to this function

list_schemas(Client, Input)

View Source

Returns a list of schemas with minimal details.

Schemas in Deleting status will not be included in the results. Empty results will be returned if there are no schemas available.

When the RegistryId is not provided, all the schemas across registries will be part of the API response.
Link to this function

list_schemas(Client, Input, Options)

View Source
Link to this function

list_sessions(Client, Input)

View Source
Retrieves a list of sessions.
Link to this function

list_sessions(Client, Input, Options)

View Source
Link to this function

list_statements(Client, Input)

View Source
Lists statements for the session.
Link to this function

list_statements(Client, Input, Options)

View Source
Link to this function

list_table_optimizer_runs(Client, Input)

View Source
Lists the history of previous optimizer runs for a specific table.
Link to this function

list_table_optimizer_runs(Client, Input, Options)

View Source
Link to this function

list_triggers(Client, Input)

View Source

Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag.

This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
Link to this function

list_triggers(Client, Input, Options)

View Source
Link to this function

list_workflows(Client, Input)

View Source
Lists names of workflows created in the account.
Link to this function

list_workflows(Client, Input, Options)

View Source
Link to this function

put_data_catalog_encryption_settings(Client, Input)

View Source

Sets the security configuration for a specified catalog.

After the configuration has been set, the specified encryption is applied to every catalog write thereafter.
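
A hedged sketch of enabling SSE-KMS encryption for the catalog; the nested field names (DataCatalogEncryptionSettings, EncryptionAtRest, CatalogEncryptionMode, SseAwsKmsKeyId) are assumptions based on the Glue API shape:

    %% Sketch only: client construction and nested key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Settings = #{<<"EncryptionAtRest">> =>
                     #{<<"CatalogEncryptionMode">> => <<"SSE-KMS">>,
                       <<"SseAwsKmsKeyId">> => <<"alias/glue-catalog">>}},
    {ok, _Result, _HttpResponse} =
        aws_glue:put_data_catalog_encryption_settings(
            Client, #{<<"DataCatalogEncryptionSettings">> => Settings}).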
Link to this function

put_data_catalog_encryption_settings(Client, Input, Options)

View Source
Link to this function

put_resource_policy(Client, Input)

View Source
Sets the Data Catalog resource policy for access control.
Link to this function

put_resource_policy(Client, Input, Options)

View Source
Link to this function

put_schema_version_metadata(Client, Input)

View Source

Puts the metadata key-value pair for a specified schema version ID.

A maximum of 10 key-value pairs is allowed per schema version. They can be added over one or more calls.
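
A minimal sketch of adding one metadata pair, assuming the SchemaVersionId and MetadataKeyValue field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"SchemaVersionId">> => <<"11111111-2222-3333-4444-555555555555">>,
              <<"MetadataKeyValue">> => #{<<"MetadataKey">> => <<"owner">>,
                                          <<"MetadataValue">> => <<"data-platform">>}},
    {ok, _Result, _HttpResponse} = aws_glue:put_schema_version_metadata(Client, Input).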
Link to this function

put_schema_version_metadata(Client, Input, Options)

View Source
Link to this function

put_workflow_run_properties(Client, Input)

View Source

Puts the specified workflow run properties for the given workflow run.

If a property already exists for the specified run, its value is overridden; otherwise, the property is added to the existing properties.
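
A minimal sketch, assuming the Name, RunId, and RunProperties field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"Name">> => <<"nightly-etl">>,
              <<"RunId">> => <<"wr_0123456789abcdef">>,
              <<"RunProperties">> => #{<<"target_date">> => <<"2024-01-31">>}},
    {ok, _Result, _HttpResponse} = aws_glue:put_workflow_run_properties(Client, Input).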
Link to this function

put_workflow_run_properties(Client, Input, Options)

View Source
Link to this function

query_schema_version_metadata(Client, Input)

View Source
Queries for the schema version metadata information.
Link to this function

query_schema_version_metadata(Client, Input, Options)

View Source
Link to this function

register_schema_version(Client, Input)

View Source

Adds a new version to the existing schema.

Returns an error if the new version of the schema does not meet the compatibility requirements of the schema set. This API will not create a new schema set and will return a 404 error if the schema set is not already present in the Schema Registry.

If this is the first schema definition to be registered in the Schema Registry, this API will store the schema version and return immediately. Otherwise, this call has the potential to run longer than other operations due to compatibility modes. You can call the GetSchemaVersion API with the SchemaVersionId to check compatibility modes.

If the same schema definition is already stored in Schema Registry as a version, the schema ID of the existing schema is returned to the caller.
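
A hedged sketch of registering a new Avro version; the SchemaId and SchemaDefinition field names are assumptions based on the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Definition = <<"{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}">>,
    Input = #{<<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>,
                                  <<"SchemaName">> => <<"orders">>},
              <<"SchemaDefinition">> => Definition},
    {ok, Result, _HttpResponse} = aws_glue:register_schema_version(Client, Input),
    maps:get(<<"SchemaVersionId">>, Result, undefined).  %% response key is assumed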
Link to this function

register_schema_version(Client, Input, Options)

View Source
Link to this function

remove_schema_version_metadata(Client, Input)

View Source
Removes a key-value pair from the schema version metadata for the specified schema version ID.
Link to this function

remove_schema_version_metadata(Client, Input, Options)

View Source
Link to this function

reset_job_bookmark(Client, Input)

View Source

Resets a bookmark entry.

For more information about enabling and using job bookmarks, see:

  • Tracking processed data using job bookmarks: https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html

  • Job parameters used by Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html

  • Job structure: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html#aws-glue-api-jobs-job-Job

Link to this function

reset_job_bookmark(Client, Input, Options)

View Source
Link to this function

resume_workflow_run(Client, Input)

View Source

Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.

The selected nodes and all nodes that are downstream from the selected nodes are run.
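
A minimal sketch, assuming the Name, RunId, and NodeIds field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"Name">> => <<"nightly-etl">>,
              <<"RunId">> => <<"wr_0123456789abcdef">>,
              <<"NodeIds">> => [<<"node_1">>, <<"node_2">>]},  %% nodes to restart
    {ok, _Result, _HttpResponse} = aws_glue:resume_workflow_run(Client, Input).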
Link to this function

resume_workflow_run(Client, Input, Options)

View Source
Link to this function

run_statement(Client, Input)

View Source
Executes the statement.
Link to this function

run_statement(Client, Input, Options)

View Source
Link to this function

search_tables(Client, Input)

View Source

Searches a set of tables based on properties in the table metadata as well as on the parent database.

You can search against text or filter conditions.

You can only get tables that you have access to based on the security policies defined in Lake Formation. You need at least read-only access to the table for it to be returned. If you do not have access to all the columns in the table, these columns will not be searched against when returning the list of tables back to you. If you have access to the columns but not the data in the columns, those columns and the associated metadata for those columns will be included in the search.
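
A hedged sketch of a search combining free text and a filter condition; the SearchText, Filters (Key, Value, Comparator), and MaxResults field names are assumptions based on the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"SearchText">> => <<"orders">>,
              <<"Filters">> => [#{<<"Key">> => <<"DatabaseName">>,
                                  <<"Value">> => <<"sales_db">>,
                                  <<"Comparator">> => <<"EQUALS">>}],
              <<"MaxResults">> => 25},
    {ok, Result, _HttpResponse} = aws_glue:search_tables(Client, Input).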
Link to this function

search_tables(Client, Input, Options)

View Source
Link to this function

start_blueprint_run(Client, Input)

View Source
Starts a new run of the specified blueprint.
Link to this function

start_blueprint_run(Client, Input, Options)

View Source
Link to this function

start_column_statistics_task_run(Client, Input)

View Source
Starts a column statistics task run, for a specified table and columns.
Link to this function

start_column_statistics_task_run(Client, Input, Options)

View Source
Link to this function

start_crawler(Client, Input)

View Source

Starts a crawl using the specified crawler, regardless of what is scheduled.

If the crawler is already running, returns a CrawlerRunningException: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-exceptions.html#aws-glue-api-exceptions-CrawlerRunningException.
Link to this function

start_crawler(Client, Input, Options)

View Source
Link to this function

start_crawler_schedule(Client, Input)

View Source
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
Link to this function

start_crawler_schedule(Client, Input, Options)

View Source
Link to this function

start_data_quality_rule_recommendation_run(Client, Input)

View Source

Starts a recommendation run that is used to generate rules when you don't know what rules to write.

Glue Data Quality analyzes the data and comes up with recommendations for a potential ruleset. You can then triage and modify the generated ruleset as needed.

Recommendation runs are automatically deleted after 90 days.
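
A hedged sketch of starting a recommendation run against a Glue table; the DataSource, GlueTable, Role, and CreatedRulesetName field names are assumptions based on the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"DataSource">> =>
                  #{<<"GlueTable">> => #{<<"DatabaseName">> => <<"sales_db">>,
                                         <<"TableName">> => <<"orders">>}},
              <<"Role">> => <<"arn:aws:iam::123456789012:role/GlueDataQualityRole">>,
              <<"CreatedRulesetName">> => <<"orders-recommended-rules">>},
    {ok, _Result, _HttpResponse} =
        aws_glue:start_data_quality_rule_recommendation_run(Client, Input).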
Link to this function

start_data_quality_rule_recommendation_run(Client, Input, Options)

View Source
Link to this function

start_data_quality_ruleset_evaluation_run(Client, Input)

View Source

Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table).

The evaluation computes results which you can retrieve with the GetDataQualityResult API.
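
A hedged sketch of evaluating a named ruleset against a Glue table; the DataSource, Role, and RulesetNames field names are assumptions based on the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"DataSource">> =>
                  #{<<"GlueTable">> => #{<<"DatabaseName">> => <<"sales_db">>,
                                         <<"TableName">> => <<"orders">>}},
              <<"Role">> => <<"arn:aws:iam::123456789012:role/GlueDataQualityRole">>,
              <<"RulesetNames">> => [<<"orders-recommended-rules">>]},
    {ok, Result, _HttpResponse} = aws_glue:start_data_quality_ruleset_evaluation_run(Client, Input),
    maps:get(<<"RunId">>, Result, undefined).  %% run identifier for tracking the evaluation (assumed key)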
Link to this function

start_data_quality_ruleset_evaluation_run(Client, Input, Options)

View Source
Link to this function

start_export_labels_task_run(Client, Input)

View Source

Begins an asynchronous task to export all labeled data for a particular transform.

This task is the only label-related API call that is not part of the typical active learning workflow. You typically use StartExportLabelsTaskRun when you want to work with all of your existing labels at the same time, such as when you want to remove or change labels that were previously submitted as truth. This API operation accepts the TransformId whose labels you want to export and an Amazon Simple Storage Service (Amazon S3) path to export the labels to. The operation returns a TaskRunId. You can check on the status of your task run by calling the GetMLTaskRun API.
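
A minimal sketch, assuming the TransformId and OutputS3Path field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"TransformId">> => <<"tfm-0123456789abcdef">>,
              <<"OutputS3Path">> => <<"s3://my-bucket/glue/labels/export/">>},
    {ok, Result, _HttpResponse} = aws_glue:start_export_labels_task_run(Client, Input),
    maps:get(<<"TaskRunId">>, Result, undefined).  %% poll its status with get_ml_task_run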
Link to this function

start_export_labels_task_run(Client, Input, Options)

View Source
Link to this function

start_import_labels_task_run(Client, Input)

View Source

Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.

This API operation is generally used as part of the active learning workflow that starts with the StartMLLabelingSetGenerationTaskRun call and that ultimately results in improving the quality of your machine learning transform.

After the StartMLLabelingSetGenerationTaskRun finishes, Glue machine learning will have generated a series of questions for humans to answer. (Answering these questions is often called 'labeling' in machine learning workflows.) In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform use the new and improved labels and perform a higher-quality transformation.

By default, the machine learning transform continually learns from and combines all labels that you upload, unless you set Replace to true. If you set Replace to true, StartImportLabelsTaskRun deletes and forgets all previously uploaded labels and learns only from the exact set that you upload. Replacing labels can be helpful if you realize that you previously uploaded incorrect labels and believe that they are having a negative effect on your transform quality.

You can check on the status of your task run by calling the GetMLTaskRun operation.
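
A minimal sketch, assuming the TransformId, InputS3Path, and ReplaceAllLabels field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"TransformId">> => <<"tfm-0123456789abcdef">>,
              <<"InputS3Path">> => <<"s3://my-bucket/glue/labels/answers.csv">>,
              <<"ReplaceAllLabels">> => false},  %% true would discard previously uploaded labels
    {ok, _Result, _HttpResponse} = aws_glue:start_import_labels_task_run(Client, Input).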
Link to this function

start_import_labels_task_run(Client, Input, Options)

View Source
Link to this function

start_job_run(Client, Input)

View Source
Starts a job run using a job definition.
Link to this function

start_job_run(Client, Input, Options)

View Source
Link to this function

start_ml_evaluation_task_run(Client, Input)

View Source

Starts a task to estimate the quality of the transform.

When you provide label sets as examples of truth, Glue machine learning uses some of those examples to learn from them. The rest of the labels are used as a test to estimate quality.

Returns a unique identifier for the run. You can call GetMLTaskRun to get more information about the status of the EvaluationTaskRun.
Link to this function

start_ml_evaluation_task_run(Client, Input, Options)

View Source
Link to this function

start_ml_labeling_set_generation_task_run(Client, Input)

View Source

Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.

When the StartMLLabelingSetGenerationTaskRun finishes, Glue will have generated a "labeling set" or a set of questions for humans to answer.

In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?”

After the labeling process is finished, you can upload your labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.
Link to this function

start_ml_labeling_set_generation_task_run(Client, Input, Options)

View Source
Link to this function

start_trigger(Client, Input)

View Source

Starts an existing trigger.

See Triggering Jobs: https://docs.aws.amazon.com/glue/latest/dg/trigger-job.html for information about how different types of triggers are started.
Link to this function

start_trigger(Client, Input, Options)

View Source
Link to this function

start_workflow_run(Client, Input)

View Source
Starts a new run of the specified workflow.
Link to this function

start_workflow_run(Client, Input, Options)

View Source
Link to this function

stop_column_statistics_task_run(Client, Input)

View Source
Stops a task run for the specified table.
Link to this function

stop_column_statistics_task_run(Client, Input, Options)

View Source
Link to this function

stop_crawler(Client, Input)

View Source
If the specified crawler is running, stops the crawl.
Link to this function

stop_crawler(Client, Input, Options)

View Source
Link to this function

stop_crawler_schedule(Client, Input)

View Source
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
Link to this function

stop_crawler_schedule(Client, Input, Options)

View Source
Link to this function

stop_session(Client, Input)

View Source
Stops the session.
Link to this function

stop_session(Client, Input, Options)

View Source
Link to this function

stop_trigger(Client, Input)

View Source
Stops a specified trigger.
Link to this function

stop_trigger(Client, Input, Options)

View Source
Link to this function

stop_workflow_run(Client, Input)

View Source
Stops the execution of the specified workflow run.
Link to this function

stop_workflow_run(Client, Input, Options)

View Source
Link to this function

tag_resource(Client, Input)

View Source

Adds tags to a resource.

A tag is a label you can assign to an Amazon Web Services resource. In Glue, you can tag only certain resources. For information about what resources you can tag, see Amazon Web Services Tags in Glue: https://docs.aws.amazon.com/glue/latest/dg/monitor-tags.html.
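
A minimal sketch, assuming the ResourceArn and TagsToAdd field names from the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"ResourceArn">> => <<"arn:aws:glue:us-east-1:123456789012:job/nightly-etl">>,
              <<"TagsToAdd">> => #{<<"team">> => <<"analytics">>, <<"env">> => <<"prod">>}},
    {ok, _Result, _HttpResponse} = aws_glue:tag_resource(Client, Input).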
Link to this function

tag_resource(Client, Input, Options)

View Source
Link to this function

untag_resource(Client, Input)

View Source
Removes tags from a resource.
Link to this function

untag_resource(Client, Input, Options)

View Source
Link to this function

update_blueprint(Client, Input)

View Source
Updates a registered blueprint.
Link to this function

update_blueprint(Client, Input, Options)

View Source
Link to this function

update_classifier(Client, Input)

View Source
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
Link to this function

update_classifier(Client, Input, Options)

View Source
Link to this function

update_column_statistics_for_partition(Client, Input)

View Source

Creates or updates partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.
Link to this function

update_column_statistics_for_partition(Client, Input, Options)

View Source
Link to this function

update_column_statistics_for_table(Client, Input)

View Source

Creates or updates table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is UpdateTable.
Link to this function

update_column_statistics_for_table(Client, Input, Options)

View Source
Link to this function

update_connection(Client, Input)

View Source
Updates a connection definition in the Data Catalog.
Link to this function

update_connection(Client, Input, Options)

View Source
Link to this function

update_crawler(Client, Input)

View Source

Updates a crawler.

If a crawler is running, you must stop it using StopCrawler before updating it.
Link to this function

update_crawler(Client, Input, Options)

View Source
Link to this function

update_crawler_schedule(Client, Input)

View Source
Updates the schedule of a crawler using a cron expression.
Link to this function

update_crawler_schedule(Client, Input, Options)

View Source
Link to this function

update_data_quality_ruleset(Client, Input)

View Source
Updates the specified data quality ruleset.
Link to this function

update_data_quality_ruleset(Client, Input, Options)

View Source
Link to this function

update_database(Client, Input)

View Source
Updates an existing database definition in a Data Catalog.
Link to this function

update_database(Client, Input, Options)

View Source
Link to this function

update_dev_endpoint(Client, Input)

View Source
Updates a specified development endpoint.
Link to this function

update_dev_endpoint(Client, Input, Options)

View Source
Link to this function

update_job(Client, Input)

View Source

Updates an existing job definition.

The previous job definition is completely overwritten by this information.
Link to this function

update_job(Client, Input, Options)

View Source
Link to this function

update_job_from_source_control(Client, Input)

View Source

Synchronizes a job from the source control repository.

This operation takes the job artifacts that are located in the remote repository and updates the Glue internal stores with these artifacts.

This API supports optional parameters that provide the repository information.
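
A hedged sketch of pulling a job definition from a Git provider; all of the field names shown (JobName, Provider, RepositoryName, RepositoryOwner, BranchName) are assumptions about the request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"JobName">> => <<"nightly-etl">>,
              <<"Provider">> => <<"GITHUB">>,
              <<"RepositoryName">> => <<"etl-jobs">>,
              <<"RepositoryOwner">> => <<"my-org">>,
              <<"BranchName">> => <<"main">>},
    {ok, _Result, _HttpResponse} = aws_glue:update_job_from_source_control(Client, Input).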
Link to this function

update_job_from_source_control(Client, Input, Options)

View Source
Link to this function

update_ml_transform(Client, Input)

View Source

Updates an existing machine learning transform.

Call this operation to tune the algorithm parameters to achieve better results.

After calling this operation, you can call the StartMLEvaluationTaskRun operation to assess how well your new parameters achieved your goals (such as improving the quality of your machine learning transform, or making it more cost-effective).
Link to this function

update_ml_transform(Client, Input, Options)

View Source
Link to this function

update_partition(Client, Input)

View Source
Updates a partition.
Link to this function

update_partition(Client, Input, Options)

View Source
Link to this function

update_registry(Client, Input)

View Source

Updates an existing registry which is used to hold a collection of schemas.

The updated properties relate to the registry, and do not modify any of the schemas within the registry.
Link to this function

update_registry(Client, Input, Options)

View Source
Link to this function

update_schema(Client, Input)

View Source

Updates the description, compatibility setting, or version checkpoint for a schema set.

When updating the compatibility setting, the call does not validate the entire set of schema versions against the new compatibility setting. If a value for Compatibility is provided, the VersionNumber (a checkpoint) is also required. The API validates the checkpoint version number for consistency.

If the value for the VersionNumber (checkpoint) is provided, Compatibility is optional and this can be used to set/reset a checkpoint for the schema.

This update will happen only if the schema is in the AVAILABLE state.
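
A hedged sketch of changing the compatibility setting with a version checkpoint; the SchemaId, Compatibility, and SchemaVersionNumber field names are assumptions based on the Glue request shape:

    %% Sketch only: client construction and key names are assumptions.
    Client = aws_client:make_client(<<"ACCESS_KEY_ID">>, <<"SECRET_ACCESS_KEY">>, <<"us-east-1">>),
    Input = #{<<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>,
                                  <<"SchemaName">> => <<"orders">>},
              <<"Compatibility">> => <<"BACKWARD">>,
              <<"SchemaVersionNumber">> => #{<<"VersionNumber">> => 3}},  %% checkpoint version
    {ok, _Result, _HttpResponse} = aws_glue:update_schema(Client, Input).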
Link to this function

update_schema(Client, Input, Options)

View Source
Link to this function

update_source_control_from_job(Client, Input)

View Source

Synchronizes a job to the source control repository.

This operation takes the job artifacts from the Glue internal stores and makes a commit to the remote repository that is configured on the job.

This API supports optional parameters that provide the repository information.
Link to this function

update_source_control_from_job(Client, Input, Options)

View Source
Link to this function

update_table(Client, Input)

View Source
Updates a metadata table in the Data Catalog.
Link to this function

update_table(Client, Input, Options)

View Source
Link to this function

update_table_optimizer(Client, Input)

View Source
Updates the configuration for an existing table optimizer.
Link to this function

update_table_optimizer(Client, Input, Options)

View Source
Link to this function

update_trigger(Client, Input)

View Source
Updates a trigger definition.
Link to this function

update_trigger(Client, Input, Options)

View Source
Link to this function

update_user_defined_function(Client, Input)

View Source
Updates an existing function definition in the Data Catalog.
Link to this function

update_user_defined_function(Client, Input, Options)

View Source
Link to this function

update_workflow(Client, Input)

View Source
Updates an existing workflow.
Link to this function

update_workflow(Client, Input, Options)

View Source