aws_glue
AWS Glue
Defines the public endpoint for the AWS Glue service.

Summary
Functions
-
batch_create_partition(Client, Input)
Creates one or more partitions in a batch operation.
- batch_create_partition(Client, Input, Options)
-
batch_delete_connection(Client, Input)
Deletes a list of connection definitions from the Data Catalog.
- batch_delete_connection(Client, Input, Options)
-
batch_delete_partition(Client, Input)
Deletes one or more partitions in a batch operation.
- batch_delete_partition(Client, Input, Options)
-
batch_delete_table(Client, Input)
Deletes multiple tables at once.
- batch_delete_table(Client, Input, Options)
-
batch_delete_table_version(Client, Input)
Deletes a specified batch of versions of a table.
- batch_delete_table_version(Client, Input, Options)
-
batch_get_crawlers(Client, Input)
Returns a list of resource metadata for a given list of crawler names.
- batch_get_crawlers(Client, Input, Options)
-
batch_get_dev_endpoints(Client, Input)
Returns a list of resource metadata for a given list of development endpoint names.
- batch_get_dev_endpoints(Client, Input, Options)
-
batch_get_jobs(Client, Input)
Returns a list of resource metadata for a given list of job names.
- batch_get_jobs(Client, Input, Options)
-
batch_get_partition(Client, Input)
Retrieves partitions in a batch request.
- batch_get_partition(Client, Input, Options)
-
batch_get_triggers(Client, Input)
Returns a list of resource metadata for a given list of trigger names.
- batch_get_triggers(Client, Input, Options)
-
batch_get_workflows(Client, Input)
Returns a list of resource metadata for a given list of workflow names.
- batch_get_workflows(Client, Input, Options)
-
batch_stop_job_run(Client, Input)
Stops one or more job runs for a specified job definition.
- batch_stop_job_run(Client, Input, Options)
-
batch_update_partition(Client, Input)
Updates one or more partitions in a batch operation.
- batch_update_partition(Client, Input, Options)
-
cancel_ml_task_run(Client, Input)
Cancels (stops) a task run.
- cancel_ml_task_run(Client, Input, Options)
-
check_schema_version_validity(Client, Input)
Validates the supplied schema.
- check_schema_version_validity(Client, Input, Options)
-
create_classifier(Client, Input)
Creates a classifier in the user's account.
- create_classifier(Client, Input, Options)
-
create_connection(Client, Input)
Creates a connection definition in the Data Catalog.
- create_connection(Client, Input, Options)
-
create_crawler(Client, Input)
Creates a new crawler with specified targets, role, configuration, and optional schedule.
- create_crawler(Client, Input, Options)
-
create_database(Client, Input)
Creates a new database in a Data Catalog.
- create_database(Client, Input, Options)
-
create_dev_endpoint(Client, Input)
Creates a new development endpoint.
- create_dev_endpoint(Client, Input, Options)
-
create_job(Client, Input)
Creates a new job definition.
- create_job(Client, Input, Options)
-
create_ml_transform(Client, Input)
Creates an AWS Glue machine learning transform.
- create_ml_transform(Client, Input, Options)
-
create_partition(Client, Input)
Creates a new partition.
- create_partition(Client, Input, Options)
-
create_partition_index(Client, Input)
Creates a specified partition index in an existing table.
- create_partition_index(Client, Input, Options)
-
create_registry(Client, Input)
Creates a new registry which may be used to hold a collection of schemas.
- create_registry(Client, Input, Options)
-
create_schema(Client, Input)
Creates a new schema set and registers the schema definition.
- create_schema(Client, Input, Options)
-
create_script(Client, Input)
Transforms a directed acyclic graph (DAG) into code.
- create_script(Client, Input, Options)
-
create_security_configuration(Client, Input)
Creates a new security configuration.
- create_security_configuration(Client, Input, Options)
-
create_table(Client, Input)
Creates a new table definition in the Data Catalog.
- create_table(Client, Input, Options)
-
create_trigger(Client, Input)
Creates a new trigger.
- create_trigger(Client, Input, Options)
-
create_user_defined_function(Client, Input)
Creates a new function definition in the Data Catalog.
- create_user_defined_function(Client, Input, Options)
-
create_workflow(Client, Input)
Creates a new workflow.
- create_workflow(Client, Input, Options)
-
delete_classifier(Client, Input)
Removes a classifier from the Data Catalog.
- delete_classifier(Client, Input, Options)
-
delete_column_statistics_for_partition(Client, Input)
Deletes the partition statistics of a column.
- delete_column_statistics_for_partition(Client, Input, Options)
-
delete_column_statistics_for_table(Client, Input)
Deletes table statistics of columns.
- delete_column_statistics_for_table(Client, Input, Options)
-
delete_connection(Client, Input)
Deletes a connection from the Data Catalog.
- delete_connection(Client, Input, Options)
-
delete_crawler(Client, Input)
Removes a specified crawler from the AWS Glue Data Catalog, unless the crawler state is RUNNING.
- delete_crawler(Client, Input, Options)
-
delete_database(Client, Input)
Removes a specified database from a Data Catalog.
- delete_database(Client, Input, Options)
-
delete_dev_endpoint(Client, Input)
Deletes a specified development endpoint.
- delete_dev_endpoint(Client, Input, Options)
-
delete_job(Client, Input)
Deletes a specified job definition.
- delete_job(Client, Input, Options)
-
delete_ml_transform(Client, Input)
Deletes an AWS Glue machine learning transform.
- delete_ml_transform(Client, Input, Options)
-
delete_partition(Client, Input)
Deletes a specified partition.
- delete_partition(Client, Input, Options)
-
delete_partition_index(Client, Input)
Deletes a specified partition index from an existing table.
- delete_partition_index(Client, Input, Options)
-
delete_registry(Client, Input)
Deletes the entire registry, including the schema and all of its versions.
- delete_registry(Client, Input, Options)
-
delete_resource_policy(Client, Input)
Deletes a specified policy.
- delete_resource_policy(Client, Input, Options)
-
delete_schema(Client, Input)
Deletes the entire schema set, including all of its versions.
- delete_schema(Client, Input, Options)
-
delete_schema_versions(Client, Input)
Removes versions from the specified schema.
- delete_schema_versions(Client, Input, Options)
-
delete_security_configuration(Client, Input)
Deletes a specified security configuration.
- delete_security_configuration(Client, Input, Options)
-
delete_table(Client, Input)
Removes a table definition from the Data Catalog.
- delete_table(Client, Input, Options)
-
delete_table_version(Client, Input)
Deletes a specified version of a table.
- delete_table_version(Client, Input, Options)
-
delete_trigger(Client, Input)
Deletes a specified trigger.
- delete_trigger(Client, Input, Options)
-
delete_user_defined_function(Client, Input)
Deletes an existing function definition from the Data Catalog.
- delete_user_defined_function(Client, Input, Options)
-
delete_workflow(Client, Input)
Deletes a workflow.
- delete_workflow(Client, Input, Options)
-
get_catalog_import_status(Client, Input)
Retrieves the status of a migration operation.
- get_catalog_import_status(Client, Input, Options)
-
get_classifier(Client, Input)
Retrieves a classifier by name.
- get_classifier(Client, Input, Options)
-
get_classifiers(Client, Input)
Lists all classifier objects in the Data Catalog.
- get_classifiers(Client, Input, Options)
-
get_column_statistics_for_partition(Client, Input)
Retrieves partition statistics of columns.
- get_column_statistics_for_partition(Client, Input, Options)
-
get_column_statistics_for_table(Client, Input)
Retrieves table statistics of columns.
- get_column_statistics_for_table(Client, Input, Options)
-
get_connection(Client, Input)
Retrieves a connection definition from the Data Catalog.
- get_connection(Client, Input, Options)
-
get_connections(Client, Input)
Retrieves a list of connection definitions from the Data Catalog.
- get_connections(Client, Input, Options)
-
get_crawler(Client, Input)
Retrieves metadata for a specified crawler.
- get_crawler(Client, Input, Options)
-
get_crawler_metrics(Client, Input)
Retrieves metrics about specified crawlers.
- get_crawler_metrics(Client, Input, Options)
-
get_crawlers(Client, Input)
Retrieves metadata for all crawlers defined in the customer account.
- get_crawlers(Client, Input, Options)
-
get_data_catalog_encryption_settings(Client, Input)
Retrieves the security configuration for a specified catalog.
- get_data_catalog_encryption_settings(Client, Input, Options)
-
get_database(Client, Input)
Retrieves the definition of a specified database.
- get_database(Client, Input, Options)
-
get_databases(Client, Input)
Retrieves all databases defined in a given Data Catalog.
- get_databases(Client, Input, Options)
-
get_dataflow_graph(Client, Input)
Transforms a Python script into a directed acyclic graph (DAG).
- get_dataflow_graph(Client, Input, Options)
-
get_dev_endpoint(Client, Input)
Retrieves information about a specified development endpoint.
- get_dev_endpoint(Client, Input, Options)
-
get_dev_endpoints(Client, Input)
Retrieves all the development endpoints in this AWS account.
- get_dev_endpoints(Client, Input, Options)
-
get_job(Client, Input)
Retrieves an existing job definition.
- get_job(Client, Input, Options)
-
get_job_bookmark(Client, Input)
Returns information on a job bookmark entry.
- get_job_bookmark(Client, Input, Options)
-
get_job_run(Client, Input)
Retrieves the metadata for a given job run.
- get_job_run(Client, Input, Options)
-
get_job_runs(Client, Input)
Retrieves metadata for all runs of a given job definition.
- get_job_runs(Client, Input, Options)
-
get_jobs(Client, Input)
Retrieves all current job definitions.
- get_jobs(Client, Input, Options)
-
get_mapping(Client, Input)
Creates mappings.
- get_mapping(Client, Input, Options)
-
get_ml_task_run(Client, Input)
Gets details for a specific task run on a machine learning transform.
- get_ml_task_run(Client, Input, Options)
-
get_ml_task_runs(Client, Input)
Gets a list of runs for a machine learning transform.
- get_ml_task_runs(Client, Input, Options)
-
get_ml_transform(Client, Input)
Gets an AWS Glue machine learning transform artifact and all its corresponding metadata.
- get_ml_transform(Client, Input, Options)
-
get_ml_transforms(Client, Input)
Gets a sortable, filterable list of existing AWS Glue machine learning transforms.
- get_ml_transforms(Client, Input, Options)
-
get_partition(Client, Input)
Retrieves information about a specified partition.
- get_partition(Client, Input, Options)
-
get_partition_indexes(Client, Input)
Retrieves the partition indexes associated with a table.
- get_partition_indexes(Client, Input, Options)
-
get_partitions(Client, Input)
Retrieves information about the partitions in a table.
- get_partitions(Client, Input, Options)
-
get_plan(Client, Input)
Gets code to perform a specified mapping.
- get_plan(Client, Input, Options)
-
get_registry(Client, Input)
Describes the specified registry in detail.
- get_registry(Client, Input, Options)
-
get_resource_policies(Client, Input)
Retrieves the security configurations for the resource policies set on individual resources, and also the account-level policy.
- get_resource_policies(Client, Input, Options)
-
get_resource_policy(Client, Input)
Retrieves a specified resource policy.
- get_resource_policy(Client, Input, Options)
-
get_schema(Client, Input)
Describes the specified schema in detail.
- get_schema(Client, Input, Options)
-
get_schema_by_definition(Client, Input)
Retrieves a schema by the SchemaDefinition.
- get_schema_by_definition(Client, Input, Options)
-
get_schema_version(Client, Input)
Get the specified schema by its unique ID assigned when a version of the schema is created or registered.
- get_schema_version(Client, Input, Options)
-
get_schema_versions_diff(Client, Input)
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.
- get_schema_versions_diff(Client, Input, Options)
-
get_security_configuration(Client, Input)
Retrieves a specified security configuration.
- get_security_configuration(Client, Input, Options)
-
get_security_configurations(Client, Input)
Retrieves a list of all security configurations.
- get_security_configurations(Client, Input, Options)
-
get_table(Client, Input)
Retrieves the Table definition in a Data Catalog for a specified table.
- get_table(Client, Input, Options)
-
get_table_version(Client, Input)
Retrieves a specified version of a table.
- get_table_version(Client, Input, Options)
-
get_table_versions(Client, Input)
Retrieves a list of strings that identify available versions of a specified table.
- get_table_versions(Client, Input, Options)
-
get_tables(Client, Input)
Retrieves the definitions of some or all of the tables in a given Database.
- get_tables(Client, Input, Options)
-
get_tags(Client, Input)
Retrieves a list of tags associated with a resource.
- get_tags(Client, Input, Options)
-
get_trigger(Client, Input)
Retrieves the definition of a trigger.
- get_trigger(Client, Input, Options)
-
get_triggers(Client, Input)
Gets all the triggers associated with a job.
- get_triggers(Client, Input, Options)
-
get_user_defined_function(Client, Input)
Retrieves a specified function definition from the Data Catalog.
- get_user_defined_function(Client, Input, Options)
-
get_user_defined_functions(Client, Input)
Retrieves multiple function definitions from the Data Catalog.
- get_user_defined_functions(Client, Input, Options)
-
get_workflow(Client, Input)
Retrieves resource metadata for a workflow.
- get_workflow(Client, Input, Options)
-
get_workflow_run(Client, Input)
Retrieves the metadata for a given workflow run.
- get_workflow_run(Client, Input, Options)
-
get_workflow_run_properties(Client, Input)
Retrieves the workflow run properties which were set during the run.
- get_workflow_run_properties(Client, Input, Options)
-
get_workflow_runs(Client, Input)
Retrieves metadata for all runs of a given workflow.
- get_workflow_runs(Client, Input, Options)
-
import_catalog_to_glue(Client, Input)
Imports an existing Amazon Athena Data Catalog to AWS Glue.
- import_catalog_to_glue(Client, Input, Options)
-
list_crawlers(Client, Input)
Retrieves the names of all crawler resources in this AWS account, or the resources with the specified tag.
- list_crawlers(Client, Input, Options)
-
list_dev_endpoints(Client, Input)
Retrieves the names of all DevEndpoint resources in this AWS account, or the resources with the specified tag.
- list_dev_endpoints(Client, Input, Options)
-
list_jobs(Client, Input)
Retrieves the names of all job resources in this AWS account, or the resources with the specified tag.
- list_jobs(Client, Input, Options)
-
list_ml_transforms(Client, Input)
Retrieves a sortable, filterable list of existing AWS Glue machine learning transforms in this AWS account, or the resources with the specified tag.
- list_ml_transforms(Client, Input, Options)
-
list_registries(Client, Input)
Returns a list of registries that you have created, with minimal registry information.
- list_registries(Client, Input, Options)
-
list_schema_versions(Client, Input)
Returns a list of schema versions that you have created, with minimal information.
- list_schema_versions(Client, Input, Options)
-
list_schemas(Client, Input)
Returns a list of schemas with minimal details.
- list_schemas(Client, Input, Options)
-
list_triggers(Client, Input)
Retrieves the names of all trigger resources in this AWS account, or the resources with the specified tag.
- list_triggers(Client, Input, Options)
-
list_workflows(Client, Input)
Lists names of workflows created in the account.
- list_workflows(Client, Input, Options)
-
put_data_catalog_encryption_settings(Client, Input)
Sets the security configuration for a specified catalog.
- put_data_catalog_encryption_settings(Client, Input, Options)
-
put_resource_policy(Client, Input)
Sets the Data Catalog resource policy for access control.
- put_resource_policy(Client, Input, Options)
-
put_schema_version_metadata(Client, Input)
Puts the metadata key value pair for a specified schema version ID.
- put_schema_version_metadata(Client, Input, Options)
-
put_workflow_run_properties(Client, Input)
Puts the specified workflow run properties for the given workflow run.
- put_workflow_run_properties(Client, Input, Options)
-
query_schema_version_metadata(Client, Input)
Queries for the schema version metadata information.
- query_schema_version_metadata(Client, Input, Options)
-
register_schema_version(Client, Input)
Adds a new version to the existing schema.
- register_schema_version(Client, Input, Options)
-
remove_schema_version_metadata(Client, Input)
Removes a key value pair from the schema version metadata for the specified schema version ID.
- remove_schema_version_metadata(Client, Input, Options)
-
reset_job_bookmark(Client, Input)
Resets a bookmark entry.
- reset_job_bookmark(Client, Input, Options)
-
resume_workflow_run(Client, Input)
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.
- resume_workflow_run(Client, Input, Options)
-
search_tables(Client, Input)
Searches a set of tables based on properties in the table metadata as well as on the parent database.
- search_tables(Client, Input, Options)
-
start_crawler(Client, Input)
Starts a crawl using the specified crawler, regardless of what is scheduled.
- start_crawler(Client, Input, Options)
-
start_crawler_schedule(Client, Input)
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
- start_crawler_schedule(Client, Input, Options)
-
start_export_labels_task_run(Client, Input)
Begins an asynchronous task to export all labeled data for a particular transform.
- start_export_labels_task_run(Client, Input, Options)
-
start_import_labels_task_run(Client, Input)
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.
- start_import_labels_task_run(Client, Input, Options)
-
start_job_run(Client, Input)
Starts a job run using a job definition.
- start_job_run(Client, Input, Options)
-
start_ml_evaluation_task_run(Client, Input)
Starts a task to estimate the quality of the transform.
- start_ml_evaluation_task_run(Client, Input, Options)
-
start_ml_labeling_set_generation_task_run(Client, Input)
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.
- start_ml_labeling_set_generation_task_run(Client, Input, Options)
-
start_trigger(Client, Input)
Starts an existing trigger.
- start_trigger(Client, Input, Options)
-
start_workflow_run(Client, Input)
Starts a new run of the specified workflow.
- start_workflow_run(Client, Input, Options)
-
stop_crawler(Client, Input)
If the specified crawler is running, stops the crawl.
- stop_crawler(Client, Input, Options)
-
stop_crawler_schedule(Client, Input)
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
- stop_crawler_schedule(Client, Input, Options)
-
stop_trigger(Client, Input)
Stops a specified trigger.
- stop_trigger(Client, Input, Options)
-
stop_workflow_run(Client, Input)
Stops the execution of the specified workflow run.
- stop_workflow_run(Client, Input, Options)
-
tag_resource(Client, Input)
Adds tags to a resource.
- tag_resource(Client, Input, Options)
-
untag_resource(Client, Input)
Removes tags from a resource.
- untag_resource(Client, Input, Options)
-
update_classifier(Client, Input)
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
- update_classifier(Client, Input, Options)
-
update_column_statistics_for_partition(Client, Input)
Creates or updates partition statistics of columns.
- update_column_statistics_for_partition(Client, Input, Options)
-
update_column_statistics_for_table(Client, Input)
Creates or updates table statistics of columns.
- update_column_statistics_for_table(Client, Input, Options)
-
update_connection(Client, Input)
Updates a connection definition in the Data Catalog.
- update_connection(Client, Input, Options)
-
update_crawler(Client, Input)
Updates a crawler.
- update_crawler(Client, Input, Options)
-
update_crawler_schedule(Client, Input)
Updates the schedule of a crawler using a cron expression.
- update_crawler_schedule(Client, Input, Options)
-
update_database(Client, Input)
Updates an existing database definition in a Data Catalog.
- update_database(Client, Input, Options)
-
update_dev_endpoint(Client, Input)
Updates a specified development endpoint.
- update_dev_endpoint(Client, Input, Options)
-
update_job(Client, Input)
Updates an existing job definition.
- update_job(Client, Input, Options)
-
update_ml_transform(Client, Input)
Updates an existing machine learning transform.
- update_ml_transform(Client, Input, Options)
-
update_partition(Client, Input)
Updates a partition.
- update_partition(Client, Input, Options)
-
update_registry(Client, Input)
Updates an existing registry which is used to hold a collection of schemas.
- update_registry(Client, Input, Options)
-
update_schema(Client, Input)
Updates the description, compatibility setting, or version checkpoint for a schema set.
- update_schema(Client, Input, Options)
-
update_table(Client, Input)
Updates a metadata table in the Data Catalog.
- update_table(Client, Input, Options)
-
update_trigger(Client, Input)
Updates a trigger definition.
- update_trigger(Client, Input, Options)
-
update_user_defined_function(Client, Input)
Updates an existing function definition in the Data Catalog.
- update_user_defined_function(Client, Input, Options)
-
update_workflow(Client, Input)
Updates an existing workflow.
- update_workflow(Client, Input, Options)
Functions
batch_create_partition(Client, Input)
Creates one or more partitions in a batch operation.
batch_create_partition(Client, Input, Options)
batch_delete_connection(Client, Input)
Deletes a list of connection definitions from the Data Catalog.
batch_delete_connection(Client, Input, Options)
batch_delete_partition(Client, Input)
Deletes one or more partitions in a batch operation.
batch_delete_partition(Client, Input, Options)
batch_delete_table(Client, Input)
Deletes multiple tables at once.
After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
batch_delete_table(Client, Input, Options)
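Under the aws-beam conventions (a client map built with aws_client:make_client/3 and input maps keyed by binary field names), a batch delete might be sketched as follows; the credentials, database, and table names are placeholders:

```erlang
%% Illustrative sketch: delete two staging tables in one batch call.
%% AccessKeyId/SecretAccessKey and all names below are placeholders.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Input = #{<<"DatabaseName">> => <<"sales_db">>,
          <<"TablesToDelete">> => [<<"staging_2020">>, <<"staging_2021">>]},
case aws_glue:batch_delete_table(Client, Input) of
    {ok, Result, _HttpResponse} ->
        %% Any tables that could not be deleted are listed under <<"Errors">>.
        maps:get(<<"Errors">>, Result, []);
    {error, Reason} ->
        Reason
end.
```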
batch_delete_table_version(Client, Input)
Deletes a specified batch of versions of a table.
batch_delete_table_version(Client, Input, Options)
batch_get_crawlers(Client, Input)
Returns a list of resource metadata for a given list of crawler names.
After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
batch_get_crawlers(Client, Input, Options)
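A sketch of the list-then-batch-get pattern described above, again under the aws-beam conventions (credentials and response keys are assumptions based on the wire-level API shape):

```erlang
%% Illustrative sketch: enumerate crawler names, then fetch their metadata.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
{ok, ListResult, _} = aws_glue:list_crawlers(Client, #{}),
Names = maps:get(<<"CrawlerNames">>, ListResult, []),
{ok, BatchResult, _} =
    aws_glue:batch_get_crawlers(Client, #{<<"CrawlerNames">> => Names}),
%% <<"Crawlers">> holds the metadata you are permitted to see;
%% names that could not be resolved appear under <<"CrawlersNotFound">>.
maps:get(<<"Crawlers">>, BatchResult, []).
```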
batch_get_dev_endpoints(Client, Input)
Returns a list of resource metadata for a given list of development endpoint names.
After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
batch_get_dev_endpoints(Client, Input, Options)
batch_get_jobs(Client, Input)
Returns a list of resource metadata for a given list of job names.
After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
batch_get_jobs(Client, Input, Options)
batch_get_partition(Client, Input)
Retrieves partitions in a batch request.
batch_get_partition(Client, Input, Options)
batch_get_triggers(Client, Input)
Returns a list of resource metadata for a given list of trigger names.
After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
batch_get_triggers(Client, Input, Options)
batch_get_workflows(Client, Input)
Returns a list of resource metadata for a given list of workflow names.
After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
batch_get_workflows(Client, Input, Options)
batch_stop_job_run(Client, Input)
Stops one or more job runs for a specified job definition.
batch_stop_job_run(Client, Input, Options)
batch_update_partition(Client, Input)
Updates one or more partitions in a batch operation.
batch_update_partition(Client, Input, Options)
cancel_ml_task_run(Client, Input)
Cancels (stops) a task run.
Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with a task run's parent transform's TransformID and the task run's TaskRunId.
cancel_ml_task_run(Client, Input, Options)
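A minimal sketch of the call described above; both IDs and the credentials are placeholders:

```erlang
%% Illustrative sketch: cancel a running ML task run.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Input = #{<<"TransformId">> => <<"tfm-0123456789abcdef">>,
          <<"TaskRunId">> => <<"tsk-0123456789abcdef">>},
{ok, Result, _} = aws_glue:cancel_ml_task_run(Client, Input),
%% The response echoes the IDs and reports the task run's new status.
maps:get(<<"Status">>, Result).
```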
check_schema_version_validity(Client, Input)
Validates the supplied schema.
This call has no side effects; it simply validates the supplied schema, using DataFormat as the format. Since it does not take a schema set name, no compatibility checks are performed.
check_schema_version_validity(Client, Input, Options)
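For example, a stand-alone validation of an Avro record schema might be sketched as (field names follow the wire-level API shape; credentials are placeholders):

```erlang
%% Illustrative sketch: validate an Avro record schema without registering it.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
SchemaDef = <<"{\"type\":\"record\",\"name\":\"Order\","
              "\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}">>,
Input = #{<<"DataFormat">> => <<"AVRO">>,
          <<"SchemaDefinition">> => SchemaDef},
{ok, Result, _} = aws_glue:check_schema_version_validity(Client, Input),
%% <<"Valid">> is a boolean; <<"Error">> carries details when it is false.
maps:get(<<"Valid">>, Result).
```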
create_classifier(Client, Input)
Creates a classifier in the user's account.
This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.
create_classifier(Client, Input, Options)
create_connection(Client, Input)
Creates a connection definition in the Data Catalog.
create_connection(Client, Input, Options)
create_crawler(Client, Input)
Creates a new crawler with specified targets, role, configuration, and optional schedule.
At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
create_crawler(Client, Input, Options)
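A sketch of a crawler with a single S3 target, assuming the wire-level request shape (S3Targets nested under Targets); the role ARN, bucket, and names are placeholders:

```erlang
%% Illustrative sketch: a crawler with one S3 target and a daily 03:00 schedule.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Input = #{<<"Name">> => <<"sales-crawler">>,
          <<"Role">> => <<"arn:aws:iam::123456789012:role/GlueCrawlerRole">>,
          <<"DatabaseName">> => <<"sales_db">>,
          <<"Targets">> =>
              #{<<"S3Targets">> => [#{<<"Path">> => <<"s3://my-bucket/sales/">>}]},
          <<"Schedule">> => <<"cron(0 3 * * ? *)">>},
{ok, _, _} = aws_glue:create_crawler(Client, Input).
```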
create_database(Client, Input)
Creates a new database in a Data Catalog.
create_database(Client, Input, Options)
create_dev_endpoint(Client, Input)
Creates a new development endpoint.
create_dev_endpoint(Client, Input, Options)
create_job(Client, Input)
Creates a new job definition.
create_job(Client, Input, Options)
create_ml_transform(Client, Input)
Creates an AWS Glue machine learning transform.
This operation creates the transform and all the necessary parameters to train it.
Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.
You must also specify certain parameters for the tasks that AWS Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.
create_ml_transform(Client, Input, Options)
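A sketch of creating a FindMatches transform over a catalog table; all names, the ARN, and the column are placeholders, and the parameter structure follows the wire-level API shape:

```erlang
%% Illustrative sketch: a FindMatches transform for deduplicating a table.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Input = #{<<"Name">> => <<"dedupe-customers">>,
          <<"Role">> => <<"arn:aws:iam::123456789012:role/GlueMLRole">>,
          <<"InputRecordTables">> =>
              [#{<<"DatabaseName">> => <<"crm">>,
                 <<"TableName">> => <<"customers">>}],
          <<"Parameters">> =>
              #{<<"TransformType">> => <<"FIND_MATCHES">>,
                <<"FindMatchesParameters">> =>
                    #{<<"PrimaryKeyColumnName">> => <<"customer_id">>}}},
{ok, Result, _} = aws_glue:create_ml_transform(Client, Input),
%% The returned <<"TransformId">> is what later ML task-run calls expect.
maps:get(<<"TransformId">>, Result).
```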
create_partition(Client, Input)
Creates a new partition.
create_partition(Client, Input, Options)
create_partition_index(Client, Input)
Creates a specified partition index in an existing table.
create_partition_index(Client, Input, Options)
create_registry(Client, Input)
Creates a new registry which may be used to hold a collection of schemas.
create_registry(Client, Input, Options)
create_schema(Client, Input)
Creates a new schema set and registers the schema definition.
Returns an error if the schema set already exists without actually registering the version.
When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used.
When the RegistryId is not provided, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.
create_schema(Client, Input, Options)
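A sketch of registering a schema in the default registry (no RegistryId given); the schema name, definition, and credentials are placeholders:

```erlang
%% Illustrative sketch: register a JSON schema with BACKWARD compatibility.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Input = #{<<"SchemaName">> => <<"orders">>,
          <<"DataFormat">> => <<"JSON">>,
          <<"Compatibility">> => <<"BACKWARD">>,
          <<"SchemaDefinition">> =>
              <<"{\"type\":\"object\","
                "\"properties\":{\"id\":{\"type\":\"string\"}}}">>},
{ok, Result, _} = aws_glue:create_schema(Client, Input),
maps:get(<<"SchemaArn">>, Result).
```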
create_script(Client, Input)
Transforms a directed acyclic graph (DAG) into code.
create_script(Client, Input, Options)
create_security_configuration(Client, Input)
Creates a new security configuration.
A security configuration is a set of security properties that can be used by AWS Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in AWS Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.
create_security_configuration(Client, Input, Options)
create_table(Client, Input)
Creates a new table definition in the Data Catalog.
create_table(Client, Input, Options)
create_trigger(Client, Input)
Creates a new trigger.
create_trigger(Client, Input, Options)
create_user_defined_function(Client, Input)
Creates a new function definition in the Data Catalog.
create_user_defined_function(Client, Input, Options)
create_workflow(Client, Input)
Creates a new workflow.
create_workflow(Client, Input, Options)
delete_classifier(Client, Input)
Removes a classifier from the Data Catalog.
delete_classifier(Client, Input, Options)
delete_column_statistics_for_partition(Client, Input)
Deletes the partition statistics of a column.
The Identity and Access Management (IAM) permission required for this operation is DeletePartition.
delete_column_statistics_for_partition(Client, Input, Options)
delete_column_statistics_for_table(Client, Input)
Deletes table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is DeleteTable.
delete_column_statistics_for_table(Client, Input, Options)
delete_connection(Client, Input)
Deletes a connection from the Data Catalog.
delete_connection(Client, Input, Options)
delete_crawler(Client, Input)
Removes a specified crawler from the AWS Glue Data Catalog, unless the crawler state is RUNNING.
delete_crawler(Client, Input, Options)
delete_database(Client, Input)
Removes a specified database from a Data Catalog.
After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.
delete_database(Client, Input, Options)
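The cleanup sequence above can be sketched as follows (the database name and credentials are placeholders; a real cleanup would also drop table versions, partitions, and user-defined functions, and page through get_tables results):

```erlang
%% Illustrative sketch: delete table-level resources first, so nothing is
%% left to be cleaned up asynchronously, then drop the database itself.
Client = aws_client:make_client(AccessKeyId, SecretAccessKey, <<"us-east-1">>),
Db = <<"sales_db">>,
{ok, TablesResult, _} = aws_glue:get_tables(Client, #{<<"DatabaseName">> => Db}),
TableNames = [maps:get(<<"Name">>, T)
              || T <- maps:get(<<"TableList">>, TablesResult, [])],
{ok, _, _} = aws_glue:batch_delete_table(
                 Client, #{<<"DatabaseName">> => Db,
                           <<"TablesToDelete">> => TableNames}),
{ok, _, _} = aws_glue:delete_database(Client, #{<<"Name">> => Db}).
```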
delete_dev_endpoint(Client, Input)
Deletes a specified development endpoint.
delete_dev_endpoint(Client, Input, Options)
delete_job(Client, Input)
Deletes a specified job definition.
If the job definition is not found, no exception is thrown.
delete_job(Client, Input, Options)
delete_ml_transform(Client, Input)
Deletes an AWS Glue machine learning transform.
Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransforms. However, any AWS Glue jobs that still reference the deleted transform will no longer succeed.
delete_ml_transform(Client, Input, Options)
delete_partition(Client, Input)
Deletes a specified partition.
delete_partition(Client, Input, Options)
delete_partition_index(Client, Input)
Deletes a specified partition index from an existing table.
delete_partition_index(Client, Input, Options)
delete_registry(Client, Input)
Deletes the entire registry, including the schemas and all of their versions.
To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry disables all online operations for the registry, such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.
delete_registry(Client, Input, Options)
delete_resource_policy(Client, Input)
Deletes a specified policy.
delete_resource_policy(Client, Input, Options)
delete_schema(Client, Input)
Deletes the entire schema set, including the schema set and all of its versions.
To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema disables all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.
delete_schema(Client, Input, Options)
delete_schema_versions(Client, Input)
Removes versions from the specified schema.
A version number or range may be supplied. If the compatibility mode forbids deleting a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions.
If the range of version numbers contains a checkpointed version, the API returns a 409 conflict and does not proceed with the deletion. You must remove the checkpoint first, using the DeleteSchemaCheckpoint API, before calling this API.
You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set; that version can only be deleted by the DeleteSchema API. This operation also deletes the attached SchemaVersionMetadata under the schema versions. Hard deletes are enforced on the database.
delete_schema_versions(Client, Input, Options)
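A hedged sketch of deleting a version range, assuming the aws-erlang client; the registry and schema names are placeholders, and the call is untested:

```erlang
%% Untested sketch; registry name, schema name, and credentials are placeholders.
Client = aws_client:make_client(<<"AccessKeyId">>, <<"SecretAccessKey">>, <<"us-east-1">>),
{ok, _, _} = aws_glue:delete_schema_versions(Client,
    #{<<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>,
                          <<"SchemaName">> => <<"my-schema">>},
      %% Versions accepts a single number ("3") or a range ("2-4").
      <<"Versions">> => <<"2-4">>}).
```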
delete_security_configuration(Client, Input)
Deletes a specified security configuration.
delete_security_configuration(Client, Input, Options)
delete_table(Client, Input)
Removes a table definition from the Data Catalog.
After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
delete_table(Client, Input, Options)
delete_table_version(Client, Input)
Deletes a specified version of a table.
delete_table_version(Client, Input, Options)
delete_trigger(Client, Input)
Deletes a specified trigger.
If the trigger is not found, no exception is thrown.
delete_trigger(Client, Input, Options)
delete_user_defined_function(Client, Input)
Deletes an existing function definition from the Data Catalog.
delete_user_defined_function(Client, Input, Options)
delete_workflow(Client, Input)
Deletes a workflow.
delete_workflow(Client, Input, Options)
get_catalog_import_status(Client, Input)
Retrieves the status of a migration operation.
get_catalog_import_status(Client, Input, Options)
get_classifier(Client, Input)
Retrieves a classifier by name.
get_classifier(Client, Input, Options)
get_classifiers(Client, Input)
Lists all classifier objects in the Data Catalog.
get_classifiers(Client, Input, Options)
get_column_statistics_for_partition(Client, Input)
Retrieves partition statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is GetPartition.
get_column_statistics_for_partition(Client, Input, Options)
get_column_statistics_for_table(Client, Input)
Retrieves table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is GetTable.
get_column_statistics_for_table(Client, Input, Options)
get_connection(Client, Input)
Retrieves a connection definition from the Data Catalog.
get_connection(Client, Input, Options)
get_connections(Client, Input)
Retrieves a list of connection definitions from the Data Catalog.
get_connections(Client, Input, Options)
get_crawler(Client, Input)
Retrieves metadata for a specified crawler.
get_crawler(Client, Input, Options)
get_crawler_metrics(Client, Input)
Retrieves metrics about specified crawlers.
get_crawler_metrics(Client, Input, Options)
get_crawlers(Client, Input)
Retrieves metadata for all crawlers defined in the customer account.
get_crawlers(Client, Input, Options)
get_data_catalog_encryption_settings(Client, Input)
Retrieves the security configuration for a specified catalog.
get_data_catalog_encryption_settings(Client, Input, Options)
get_database(Client, Input)
Retrieves the definition of a specified database.
get_database(Client, Input, Options)
get_databases(Client, Input)
Retrieves all databases defined in a given Data Catalog.
get_databases(Client, Input, Options)
get_dataflow_graph(Client, Input)
Transforms a Python script into a directed acyclic graph (DAG).
get_dataflow_graph(Client, Input, Options)
get_dev_endpoint(Client, Input)
Retrieves information about a specified development endpoint.
When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.
get_dev_endpoint(Client, Input, Options)
get_dev_endpoints(Client, Input)
Retrieves all the development endpoints in this AWS account.
When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.
get_dev_endpoints(Client, Input, Options)
get_job(Client, Input)
Retrieves an existing job definition.
get_job(Client, Input, Options)
get_job_bookmark(Client, Input)
Returns information on a job bookmark entry.
get_job_bookmark(Client, Input, Options)
get_job_run(Client, Input)
Retrieves the metadata for a given job run.
get_job_run(Client, Input, Options)
get_job_runs(Client, Input)
Retrieves metadata for all runs of a given job definition.
get_job_runs(Client, Input, Options)
get_jobs(Client, Input)
Retrieves all current job definitions.
get_jobs(Client, Input, Options)
get_mapping(Client, Input)
Creates mappings.
get_mapping(Client, Input, Options)
get_ml_task_run(Client, Input)
Gets details for a specific task run on a machine learning transform.
Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can check the status of any task run by calling GetMLTaskRun with the TaskRunID and its parent transform's TransformID.
get_ml_task_run(Client, Input, Options)
get_ml_task_runs(Client, Input)
Gets a list of runs for a machine learning transform.
Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can get a sortable, filterable list of machine learning task runs by calling GetMLTaskRuns with their parent transform's TransformID and other optional parameters as documented in this section.
get_ml_task_runs(Client, Input, Options)
get_ml_transform(Client, Input)
Gets an AWS Glue machine learning transform artifact and all its corresponding metadata.
Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue. You can retrieve their metadata by calling GetMLTransform.
get_ml_transform(Client, Input, Options)
get_ml_transforms(Client, Input)
Gets a sortable, filterable list of existing AWS Glue machine learning transforms.
Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue, and you can retrieve their metadata by calling GetMLTransforms.
get_ml_transforms(Client, Input, Options)
get_partition(Client, Input)
Retrieves information about a specified partition.
get_partition(Client, Input, Options)
get_partition_indexes(Client, Input)
Retrieves the partition indexes associated with a table.
get_partition_indexes(Client, Input, Options)
get_partitions(Client, Input)
Retrieves information about the partitions in a table.
get_partitions(Client, Input, Options)
get_plan(Client, Input)
Gets code to perform a specified mapping.
get_plan(Client, Input, Options)
get_registry(Client, Input)
Describes the specified registry in detail.
get_registry(Client, Input, Options)
get_resource_policies(Client, Input)
Retrieves the security configurations for the resource policies set on individual resources, and also the account-level policy.
This operation also returns the Data Catalog resource policy. However, if you enabled metadata encryption in Data Catalog settings, and you do not have permission on the AWS KMS key, the operation can't return the Data Catalog resource policy.
get_resource_policies(Client, Input, Options)
get_resource_policy(Client, Input)
Retrieves a specified resource policy.
get_resource_policy(Client, Input, Options)
get_schema(Client, Input)
Describes the specified schema in detail.
get_schema(Client, Input, Options)
get_schema_by_definition(Client, Input)
Retrieves a schema by the SchemaDefinition.
The schema definition is sent to the Schema Registry, canonicalized, and hashed. If the hash is matched within the scope of the SchemaName or ARN (or the default registry, if none is supplied), that schema's metadata is returned. Otherwise, a 404 or NotFound error is returned. Schema versions in Deleted statuses will not be included in the results.
get_schema_by_definition(Client, Input, Options)
get_schema_version(Client, Input)
Get the specified schema by its unique ID assigned when a version of the schema is created or registered.
Schema versions in Deleted status will not be included in the results.
get_schema_version(Client, Input, Options)
get_schema_versions_diff(Client, Input)
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.
This API allows you to compare two schema versions between two schema definitions under the same schema.
get_schema_versions_diff(Client, Input, Options)
get_security_configuration(Client, Input)
Retrieves a specified security configuration.
get_security_configuration(Client, Input, Options)
get_security_configurations(Client, Input)
Retrieves a list of all security configurations.
get_security_configurations(Client, Input, Options)
get_table(Client, Input)
Retrieves the Table definition in a Data Catalog for a specified table.
get_table(Client, Input, Options)
get_table_version(Client, Input)
Retrieves a specified version of a table.
get_table_version(Client, Input, Options)
get_table_versions(Client, Input)
Retrieves a list of strings that identify available versions of a specified table.
get_table_versions(Client, Input, Options)
get_tables(Client, Input)
Retrieves the definitions of some or all of the tables in a given Database.
get_tables(Client, Input, Options)
get_tags(Client, Input)
Retrieves a list of tags associated with a resource.
get_tags(Client, Input, Options)
get_trigger(Client, Input)
Retrieves the definition of a trigger.
get_trigger(Client, Input, Options)
get_triggers(Client, Input)
Gets all the triggers associated with a job.
get_triggers(Client, Input, Options)
get_user_defined_function(Client, Input)
Retrieves a specified function definition from the Data Catalog.
get_user_defined_function(Client, Input, Options)
get_user_defined_functions(Client, Input)
Retrieves multiple function definitions from the Data Catalog.
get_user_defined_functions(Client, Input, Options)
get_workflow(Client, Input)
Retrieves resource metadata for a workflow.
get_workflow(Client, Input, Options)
get_workflow_run(Client, Input)
Retrieves the metadata for a given workflow run.
get_workflow_run(Client, Input, Options)
get_workflow_run_properties(Client, Input)
Retrieves the workflow run properties which were set during the run.
get_workflow_run_properties(Client, Input, Options)
get_workflow_runs(Client, Input)
Retrieves metadata for all runs of a given workflow.
get_workflow_runs(Client, Input, Options)
import_catalog_to_glue(Client, Input)
Imports an existing Amazon Athena Data Catalog to AWS Glue.
import_catalog_to_glue(Client, Input, Options)
list_crawlers(Client, Input)
Retrieves the names of all crawler resources in this AWS account, or the resources with the specified tag.
This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
list_crawlers(Client, Input, Options)
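A hedged sketch of tag-filtered listing with the aws-erlang client; the tag key and value are placeholder examples, and the call is untested:

```erlang
%% Untested sketch; credentials, region, and the tag key/value are placeholders.
Client = aws_client:make_client(<<"AccessKeyId">>, <<"SecretAccessKey">>, <<"us-east-1">>),
{ok, Result, _} = aws_glue:list_crawlers(Client,
    #{<<"MaxResults">> => 50,
      %% Only crawlers carrying this tag are returned.
      <<"Tags">> => #{<<"team">> => <<"analytics">>}}),
CrawlerNames = maps:get(<<"CrawlerNames">>, Result, []).
```

The other list_* operations (list_dev_endpoints, list_jobs, list_triggers) accept the same optional Tags map.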
list_dev_endpoints(Client, Input)
Retrieves the names of all DevEndpoint resources in this AWS account, or the resources with the specified tag.
This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
list_dev_endpoints(Client, Input, Options)
list_jobs(Client, Input)
Retrieves the names of all job resources in this AWS account, or the resources with the specified tag.
This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
list_jobs(Client, Input, Options)
list_ml_transforms(Client, Input)
Retrieves a sortable, filterable list of existing AWS Glue machine learning transforms in this AWS account, or the resources with the specified tag.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tags are retrieved.
list_ml_transforms(Client, Input, Options)
list_registries(Client, Input)
Returns a list of registries that you have created, with minimal registry information.
Registries in the Deleting status will not be included in the results. Empty results will be returned if there are no registries available.
list_registries(Client, Input, Options)
list_schema_versions(Client, Input)
Returns a list of schema versions that you have created, with minimal information.
Schema versions in Deleted status will not be included in the results. Empty results will be returned if there are no schema versions available.
list_schema_versions(Client, Input, Options)
list_schemas(Client, Input)
Returns a list of schemas with minimal details.
Schemas in Deleting status will not be included in the results. Empty results will be returned if there are no schemas available.
When the RegistryId is not provided, all the schemas across registries will be part of the API response.
list_schemas(Client, Input, Options)
list_triggers(Client, Input)
Retrieves the names of all trigger resources in this AWS account, or the resources with the specified tag.
This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
list_triggers(Client, Input, Options)
list_workflows(Client, Input)
Lists names of workflows created in the account.
list_workflows(Client, Input, Options)
put_data_catalog_encryption_settings(Client, Input)
Sets the security configuration for a specified catalog.
After the configuration has been set, the specified encryption is applied to every catalog write thereafter.
put_data_catalog_encryption_settings(Client, Input, Options)
put_resource_policy(Client, Input)
Sets the Data Catalog resource policy for access control.
put_resource_policy(Client, Input, Options)
put_schema_version_metadata(Client, Input)
Puts the metadata key value pair for a specified schema version ID.
A maximum of 10 key-value pairs is allowed per schema version. They can be added over one or more calls.
put_schema_version_metadata(Client, Input, Options)
put_workflow_run_properties(Client, Input)
Puts the specified workflow run properties for the given workflow run.
If a property already exists for the specified run, it overrides the value; otherwise, it adds the property to the existing properties.
put_workflow_run_properties(Client, Input, Options)
query_schema_version_metadata(Client, Input)
Queries for the schema version metadata information.
query_schema_version_metadata(Client, Input, Options)
register_schema_version(Client, Input)
Adds a new version to the existing schema.
Returns an error if the new version of the schema does not meet the compatibility requirements of the schema set. This API will not create a new schema set and will return a 404 error if the schema set is not already present in the Schema Registry.
If this is the first schema definition to be registered in the Schema Registry, this API will store the schema version and return immediately. Otherwise, this call has the potential to run longer than other operations due to compatibility modes. You can call the GetSchemaVersion API with the SchemaVersionId to check compatibility modes.
register_schema_version(Client, Input, Options)
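A hedged sketch of registering a new version with the aws-erlang client; the registry name, schema name, and Avro definition are placeholder examples, and the call is untested:

```erlang
%% Untested sketch; registry name, schema name, and definition are placeholders.
Client = aws_client:make_client(<<"AccessKeyId">>, <<"SecretAccessKey">>, <<"us-east-1">>),
Definition = <<"{\"type\":\"record\",\"name\":\"Point\",\"fields\":"
               "[{\"name\":\"x\",\"type\":\"int\"},{\"name\":\"y\",\"type\":\"int\"}]}">>,
{ok, Result, _} = aws_glue:register_schema_version(Client,
    #{<<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>,
                          <<"SchemaName">> => <<"points">>},
      <<"SchemaDefinition">> => Definition}),
%% While compatibility checks run, Status may be PENDING; poll
%% get_schema_version with this ID until it becomes AVAILABLE.
VersionId = maps:get(<<"SchemaVersionId">>, Result).
```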
remove_schema_version_metadata(Client, Input)
Removes a key value pair from the schema version metadata for the specified schema version ID.
remove_schema_version_metadata(Client, Input, Options)
reset_job_bookmark(Client, Input)
Resets a bookmark entry.
reset_job_bookmark(Client, Input, Options)
resume_workflow_run(Client, Input)
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.
The selected nodes and all nodes that are downstream from the selected nodes are run.
resume_workflow_run(Client, Input, Options)
search_tables(Client, Input)
Searches a set of tables based on properties in the table metadata as well as on the parent database.
You can search against text or filter conditions.
You can only get tables that you have access to based on the security policies defined in Lake Formation. You need at least read-only access to the table for it to be returned. If you do not have access to all the columns in the table, these columns will not be searched against when returning the list of tables back to you. If you have access to the columns but not the data in the columns, those columns and the associated metadata for those columns will be included in the search.
search_tables(Client, Input, Options)
start_crawler(Client, Input)
Starts a crawl using the specified crawler, regardless of what is scheduled.
If the crawler is already running, returns a CrawlerRunningException.
start_crawler(Client, Input, Options)
start_crawler_schedule(Client, Input)
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
start_crawler_schedule(Client, Input, Options)
start_export_labels_task_run(Client, Input)
Begins an asynchronous task to export all labeled data for a particular transform.
This task is the only label-related API call that is not part of the typical active learning workflow. You typically use StartExportLabelsTaskRun when you want to work with all of your existing labels at the same time, such as when you want to remove or change labels that were previously submitted as truth. This API operation accepts the TransformId whose labels you want to export and an Amazon Simple Storage Service (Amazon S3) path to export the labels to. The operation returns a TaskRunId. You can check on the status of your task run by calling the GetMLTaskRun API.
start_export_labels_task_run(Client, Input, Options)
start_import_labels_task_run(Client, Input)
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.
This API operation is generally used as part of the active learning workflow that starts with the StartMLLabelingSetGenerationTaskRun call and that ultimately results in improving the quality of your machine learning transform.
After the StartMLLabelingSetGenerationTaskRun finishes, AWS Glue machine learning will have generated a series of questions for humans to answer. (Answering these questions is often called 'labeling' in the machine learning workflows.) In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform use the new and improved labels and perform a higher-quality transformation.
By default, StartMLLabelingSetGenerationTaskRun continually learns from and combines all labels that you upload unless you set Replace to true. If you set Replace to true, StartImportLabelsTaskRun deletes and forgets all previously uploaded labels and learns only from the exact set that you upload. Replacing labels can be helpful if you realize that you previously uploaded incorrect labels, and you believe that they are having a negative effect on your transform quality.
You can check on the status of your task run by calling the GetMLTaskRun operation.
start_import_labels_task_run(Client, Input, Options)
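A hedged sketch of uploading labels with the aws-erlang client; the transform ID and S3 path are placeholders, and the call is untested. Note that the request field corresponding to the Replace behavior described above appears in the API as ReplaceAllLabels:

```erlang
%% Untested sketch; credentials, transform ID, and S3 path are placeholders.
Client = aws_client:make_client(<<"AccessKeyId">>, <<"SecretAccessKey">>, <<"us-east-1">>),
{ok, Result, _} = aws_glue:start_import_labels_task_run(Client,
    #{<<"TransformId">> => <<"tfm-0123456789abcdef">>,
      <<"InputS3Path">> => <<"s3://my-bucket/labels/answers.csv">>,
      %% true would discard all previously uploaded labels.
      <<"ReplaceAllLabels">> => false}),
%% Poll the task with get_ml_task_run using this ID.
TaskRunId = maps:get(<<"TaskRunId">>, Result).
```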
start_job_run(Client, Input)
Starts a job run using a job definition.
start_job_run(Client, Input, Options)
start_ml_evaluation_task_run(Client, Input)
Starts a task to estimate the quality of the transform.
When you provide label sets as examples of truth, AWS Glue machine learning uses some of those examples to learn from them. The rest of the labels are used as a test to estimate quality.
Returns a unique identifier for the run. You can call GetMLTaskRun to get more information about the status of the EvaluationTaskRun.
start_ml_evaluation_task_run(Client, Input, Options)
start_ml_labeling_set_generation_task_run(Client, Input)
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.
When the StartMLLabelingSetGenerationTaskRun finishes, AWS Glue will have generated a "labeling set" or a set of questions for humans to answer.
In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?”
After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.
start_ml_labeling_set_generation_task_run(Client, Input, Options)
start_trigger(Client, Input)
Starts an existing trigger.
See Triggering Jobs for information about how different types of trigger are started.
start_trigger(Client, Input, Options)
start_workflow_run(Client, Input)
Starts a new run of the specified workflow.
start_workflow_run(Client, Input, Options)
stop_crawler(Client, Input)
If the specified crawler is running, stops the crawl.
stop_crawler(Client, Input, Options)
stop_crawler_schedule(Client, Input)
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
stop_crawler_schedule(Client, Input, Options)
stop_trigger(Client, Input)
Stops a specified trigger.
stop_trigger(Client, Input, Options)
stop_workflow_run(Client, Input)
Stops the execution of the specified workflow run.
stop_workflow_run(Client, Input, Options)
tag_resource(Client, Input)
Adds tags to a resource.
A tag is a label you can assign to an AWS resource. In AWS Glue, you can tag only certain resources. For information about what resources you can tag, see AWS Tags in AWS Glue.
tag_resource(Client, Input, Options)
untag_resource(Client, Input)
Removes tags from a resource.
untag_resource(Client, Input, Options)
update_classifier(Client, Input)
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
update_classifier(Client, Input, Options)
update_column_statistics_for_partition(Client, Input)
Creates or updates partition statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.
update_column_statistics_for_partition(Client, Input, Options)
update_column_statistics_for_table(Client, Input)
Creates or updates table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is UpdateTable.
update_column_statistics_for_table(Client, Input, Options)
update_connection(Client, Input)
Updates a connection definition in the Data Catalog.
update_connection(Client, Input, Options)
update_crawler(Client, Input)
Updates a crawler.
If a crawler is running, you must stop it using StopCrawler before updating it.
update_crawler(Client, Input, Options)
update_crawler_schedule(Client, Input)
Updates the schedule of a crawler using a cron expression.
update_crawler_schedule(Client, Input, Options)
update_database(Client, Input)
Updates an existing database definition in a Data Catalog.
update_database(Client, Input, Options)
update_dev_endpoint(Client, Input)
Updates a specified development endpoint.
update_dev_endpoint(Client, Input, Options)
update_job(Client, Input)
Updates an existing job definition.
update_job(Client, Input, Options)
update_ml_transform(Client, Input)
Updates an existing machine learning transform.
Call this operation to tune the algorithm parameters to achieve better results.
After calling this operation, you can call the StartMLEvaluationTaskRun operation to assess how well your new parameters achieved your goals (such as improving the quality of your machine learning transform, or making it more cost-effective).
update_ml_transform(Client, Input, Options)
update_partition(Client, Input)
Updates a partition.
update_partition(Client, Input, Options)
update_registry(Client, Input)
Updates an existing registry which is used to hold a collection of schemas.
The updated properties relate to the registry, and do not modify any of the schemas within the registry.
update_registry(Client, Input, Options)
update_schema(Client, Input)
Updates the description, compatibility setting, or version checkpoint for a schema set.
For updating the compatibility setting, the call will not validate compatibility for the entire set of schema versions with the new compatibility setting. If the value for Compatibility is provided, the VersionNumber (a checkpoint) is also required. The API will validate the checkpoint version number for consistency.
If the value for the VersionNumber (checkpoint) is provided, Compatibility is optional and this can be used to set/reset a checkpoint for the schema.
update_schema(Client, Input, Options)
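A hedged sketch of updating the compatibility setting together with its required checkpoint version, assuming the aws-erlang client; the names and version number are placeholders, and the call is untested:

```erlang
%% Untested sketch; registry name, schema name, and version are placeholders.
Client = aws_client:make_client(<<"AccessKeyId">>, <<"SecretAccessKey">>, <<"us-east-1">>),
{ok, _, _} = aws_glue:update_schema(Client,
    #{<<"SchemaId">> => #{<<"RegistryName">> => <<"my-registry">>,
                          <<"SchemaName">> => <<"points">>},
      %% When Compatibility is given, a checkpoint version is also required.
      <<"Compatibility">> => <<"BACKWARD">>,
      <<"SchemaVersionNumber">> => #{<<"VersionNumber">> => 3}}).
```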
update_table(Client, Input)
Updates a metadata table in the Data Catalog.
update_table(Client, Input, Options)
update_trigger(Client, Input)
Updates a trigger definition.
update_trigger(Client, Input, Options)
update_user_defined_function(Client, Input)
Updates an existing function definition in the Data Catalog.
update_user_defined_function(Client, Input, Options)
update_workflow(Client, Input)
Updates an existing workflow.