EntityRecognizer

class arcgis.learn.text.EntityRecognizer(data=None, lang='en', backbone='spacy', **kwargs)

Creates an entity recognition model to extract text entities from unstructured text documents.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
data	Optional data object returned from the `prepare_textdata()` function. The data object can be None when using a Hugging Face Transformer model that is already fine-tuned on an entity-recognition task. In this case the model should be used directly for inference.
lang	Optional string. Language-specific code, named according to the language’s ISO code. The default value is ‘en’ for English.
backbone	Optional string. Specify spacy, mistral, or the HuggingFace transformer model name to be used to train the entity recognizer model. Default set to spacy. Entity recognition via spaCy is based on <https://spacy.io/api/entityrecognizer> To learn more about the available transformer models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about the available transformer models fine-tuned on the Named Entity Recognition task, see https://huggingface.co/models?pipeline_tag=token-classification To learn more about mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

kwargs

Parameter	Description
verbose	Optional string. Default set to error. The log level you want to set, that is, the amount of information to display while training or calling the various methods of this class. Allowed values are debug, info, warning, error, and critical. Applicable only for models with HuggingFace transformer backbones.
seq_len	Optional integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model. Applicable only for models with HuggingFace transformer backbones.
mixed_precision	Optional bool. Default set to False. If set to True, mixed precision training is used to train the model. Applicable only for models with HuggingFace transformer backbones.
pretrained_path	Optional string. Path where the pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.
prompt	Optional string. Applicable if the selected model backbone is from the LLM family. This parameter outlines the task and its corresponding guardrails.
examples	Optional list. The list comprises tuple(s) where the first element is the text for entity extraction and the second element is a dictionary that maps named entities. Applicable if the selected model backbone is from the LLM family. Pydantic schema: List[Tuple[str, Dict[str, List]]] Example: [("Jim stays in London", {"name": ["Jim"], "location": ["London"]})] If examples are not supplied, a data object must be provided.
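
The shape expected for examples can be checked in plain Python before a model is constructed. The following is a minimal sketch and not part of arcgis.learn: the helper name validate_ner_examples is hypothetical, and the check that every entity string occurs in its text is an extra assumption rather than a documented requirement.

```python
from typing import Dict, List, Tuple

# Mirrors the documented schema List[Tuple[str, Dict[str, List]]].
NerExample = Tuple[str, Dict[str, List[str]]]


def validate_ner_examples(examples: List[NerExample]) -> None:
    """Raise ValueError if `examples` does not match the documented shape."""
    for i, item in enumerate(examples):
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(f"example {i}: expected a (text, entities) tuple")
        text, entities = item
        if not isinstance(text, str) or not isinstance(entities, dict):
            raise ValueError(f"example {i}: expected (str, dict)")
        for label, values in entities.items():
            if not isinstance(label, str) or not isinstance(values, list):
                raise ValueError(f"example {i}: label {label!r} must map to a list")
            for value in values:
                # Assumption (not stated in the reference): entities are
                # literal substrings of the example text.
                if not isinstance(value, str) or value not in text:
                    raise ValueError(f"example {i}: {value!r} not found in text")


examples = [("Jim stays in London", {"name": ["Jim"], "location": ["London"]})]
validate_ner_examples(examples)  # passes silently for well-formed input
```

Running a check like this first makes malformed few-shot examples fail early with a readable message.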

Returns:: EntityRecognizer Object

classmethod available_backbone_models(architecture)

Get available models for the given entity recognition backbone

Parameter	Description
architecture	Required string. Name of the architecture or ‘llm’ one wishes to use. To learn more about the available models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about llm and mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

Returns:: a tuple containing the available models for the given entity recognition backbone

property available_metrics: List of available metrics that are displayed in the training table. Set the monitor value to one of these while calling the fit method.

extract_entities(text_list, drop=True, batch_size=4, show_progress=True, **kwargs) → DataFrame | FeatureSet

Extracts the entities from the documents in the given path or in text_list.

The field defined as ‘address_tag’ in the class mapping attribute of the prepare_data() function is treated as a location. When the trained model extracts multiple locations from a single document, that document is replicated for each location in the resulting dataframe.
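
The replication of a document for each extracted location, together with the effect of drop, can be pictured with a small plain-Python sketch. This is an illustration only and not the library's implementation: the function name expand_by_location and the column names "TEXT" and "Address" are hypothetical.

```python
from typing import Dict, List


def expand_by_location(
    records: List[Dict], address_field: str = "Address", drop: bool = True
) -> List[Dict]:
    """Return one row per (document, location) pair.

    Documents with no location are skipped when `drop` is True and kept
    with an empty address otherwise.
    """
    rows = []
    for record in records:
        locations = record.get(address_field) or []
        if not locations:
            if not drop:
                rows.append({**record, address_field: None})
            continue
        for location in locations:
            # Copy the document once per extracted location.
            rows.append({**record, address_field: location})
    return rows


records = [
    {"TEXT": "doc A", "Address": ["12 Main St", "5 Oak Ave"]},
    {"TEXT": "doc B", "Address": []},
]
kept = expand_by_location(records)                   # doc A twice, doc B dropped
with_empty = expand_by_location(records, drop=False)  # doc B retained as well
```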

Parameter	Description
text_list	Required string (path) or list (documents). List of documents for entity extraction, or path to the documents.
drop	Optional bool. Whether documents without an address are dropped from the results. Default is set to True.
batch_size	Optional integer. Number of items to process at once. Reduce it if you get CUDA out-of-memory errors. Default is set to 4. Not applicable for models with spaCy backbone.
show_progress	Optional bool. If set to True, displays a progress bar depicting the items processed so far. Applicable only when a list of text is passed.

kwargs

Parameter	Description
input_field	Optional string. Input field name in the feature set. Supported in model extension. Default value: input_str

Returns:: Pandas DataFrame

f1_score(): Calculate F1 score of the trained model

fit(epochs=20, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, **kwargs)

Train the model for the specified number of epochs and using the specified learning rates.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
epochs	Required integer. Number of cycles of training on the data. Increase it if underfitting.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model. Note: Passing a slice of floats as the lr value is not supported for models with spaCy backbone.
one_cycle	Optional boolean. Parameter to select the 1cycle learning rate schedule. If set to False, no learning rate schedule is used. Note: Not applicable for models with spaCy backbone.
early_stopping	Optional boolean. Parameter to add early stopping. If set to ‘True’, training will stop if the monitor value stops improving for 5 epochs. Note: Not applicable for models with spaCy backbone.
checkpoint	Optional boolean or string. Parameter to save checkpoints during training. If set to True, the best model based on monitor will be saved during training. If set to ‘all’, all checkpoints are saved. If set to False, checkpointing will be off. Setting this parameter loads the best model at the end of training. Note: Not applicable for models with spaCy backbone.
tensorboard	Optional boolean. Parameter to write the training log. If set to ‘True’, the log will be saved at <dataset-path>/training_log, which can be visualized in tensorboard. Requires tensorboardx version 2.1. The default value is ‘False’. Note: Not applicable for text models.
monitor	Optional string. Specifies which metric to monitor while checkpointing and early stopping. Defaults to ‘valid_loss’. The value should be one of the metrics displayed in the training table. Use {model_name}.available_metrics to list the available metrics to set here. Note: Not applicable for models with spaCy backbone.

freeze()

Freeze up to last layer group to train only the last layer group of the model.

This method is not supported when the backbone is configured as llm/mistral.

classmethod from_model(emd_path, data=None, **kwargs)

Creates an EntityRecognizer model object from a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using this method.

Returns:: EntityRecognizer Object

classmethod from_pretrained(backbone, **kwargs)

Creates an EntityRecognizer model object from an already fine-tuned Hugging Face Transformer backbone.

This method is not supported when the backbone is configured as llm/mistral.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
backbone	Required string. Specify the Hugging Face Transformer backbone name fine-tuned on a Named Entity Recognition (NER)/Token Classification task. For more details on available transformer models fine-tuned on the NER task, see https://huggingface.co/models?pipeline_tag=token-classification

kwargs

Parameter	Description
verbose	Optional string. Default set to error. The log level you want to set, that is, the amount of information to display while calling the various methods of this class. Allowed values are debug, info, warning, error, and critical.

Returns:: EntityRecognizer Object

load(name_or_path)

Loads a saved EntityRecognizer model from disk.

This method is not supported when the backbone is configured as llm/mistral.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
name_or_path	Required string. Path to a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

lr_find(allow_plot=True, **kwargs)

Runs the Learning Rate Finder. Helps in choosing the optimum learning rate for training the model.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
allow_plot	Optional boolean. Display the plot of losses against the learning rates and mark the optimal value of the learning rate on the plot. The default value is ‘True’.

metrics_per_label(): Calculates precision, recall, and F1 scores for each label/entity on which the model was trained.

plot_losses(show=True)

Plot training and validation losses.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
show	Optional bool. Defaults to True. If set to False, the figure is not plotted but is returned. If set to True, the function plots the figure and returns nothing.

Returns:: matplotlib.figure.Figure

precision_score(): Calculate precision score of the trained model

recall_score(): Calculate recall score of the trained model

save(name_or_path, **kwargs)

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment to Image Server or ArcGIS Pro.

Parameter	Description
name_or_path	Required string. Name of the model to save. The model is stored at the pre-defined location. If a path is passed, the model is stored at the specified path with the model name as the directory name, and all intermediate directories are created.
publish	Optional boolean. Publishes the DLPK as an item. Default is set to False.
gis	Optional `GIS` object. Used for publishing the item. If not specified, the active GIS user is taken.
compute_metrics	Optional boolean. Used for computing model metrics. Default is set to True.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False. Not applicable for models with spaCy backbone.

kwargs

Parameter	Description
overwrite	Optional boolean. If True, it will overwrite the item on ArcGIS Online/Enterprise. Default is False.
zip_files	Optional boolean. If True, it will create the Deep Learning Package (DLPK) file while saving the model.

show_results(rows=5, ds_type='valid')

Runs entity extraction on a random batch from the mentioned ds_type.

Parameter	Description
ds_type	Optional string, defaults to valid.
rows	Optional integer, defaults to 5. Number of rows to print.

Returns:: Pandas DataFrame

supported_backbones=['spacy','BERT','RoBERTa','DistilBERT','ALBERT','CamemBERT','MobileBERT','XLNet','XLM','XLM-RoBERTa','FlauBERT','ELECTRA','Longformer','Funnel','LLM']

unfreeze()

Unfreezes the earlier layers of the model for fine-tuning.

This method is not supported when the backbone is configured as llm/mistral.
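
A common way to combine freeze(), fit(), and unfreeze() is to train the last layer group first and then fine-tune the whole network. The sketch below shows that call order only; it is a general fine-tuning pattern and not a workflow prescribed by this reference. The helper two_stage_fit and the RecordingModel stand-in are hypothetical, so the snippet runs without arcgis installed; with a real model, the same helper would be called on a transformer-backed EntityRecognizer.

```python
def two_stage_fit(model, head_epochs: int, full_epochs: int) -> None:
    """Train the last layer group first, then fine-tune every layer."""
    model.freeze()                  # only the last layer group is trainable
    model.fit(epochs=head_epochs)   # lr is left at its default of None
    model.unfreeze()                # earlier layers become trainable
    model.fit(epochs=full_epochs)


class RecordingModel:
    """Hypothetical stand-in that records calls instead of training."""

    def __init__(self):
        self.calls = []

    def freeze(self):
        self.calls.append("freeze")

    def unfreeze(self):
        self.calls.append("unfreeze")

    def fit(self, epochs, lr=None):
        self.calls.append(("fit", epochs))


demo = RecordingModel()
two_stage_fit(demo, head_epochs=5, full_epochs=10)
```

Because these methods are not supported for llm/mistral backbones, a pattern like this applies only to trainable backbones.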

TextClassifier

class arcgis.learn.text.TextClassifier(data, backbone='bert-base-cased', **kwargs)

Creates a TextClassifier Object.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Based on the Hugging Face transformers library

Parameter	Description
data	Optional data object returned from the prepare_textdata function. The data object can be None when using a Hugging Face Transformer model that is already fine-tuned on a classification task. In this case the model should be used directly for inference.
backbone	Optional string. Specify gpt or the HuggingFace transformer model name to be used to train the classifier. Default set to bert-base-cased. To learn more about the available models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about the available transformer models fine-tuned on the Text Classification task, see https://huggingface.co/models?pipeline_tag=text-classification To learn more about mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

kwargs

Parameter	Description
verbose	Optional string. Default set to error. The log level you want to set, that is, the amount of information to display while training or calling the various methods of this class. Allowed values are debug, info, warning, error, and critical.
seq_len	Optional integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model.
thresh	Optional float. The threshold value used to pick labels in a multi-label text classification problem. Default value is set to 0.25.
mixed_precision	Optional bool. Default set to False. If set to True, mixed precision training is used to train the model.
pretrained_path	Optional string. Path where the pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.
prompt	Optional string. Applicable if the selected model backbone is from the LLM family. This parameter outlines the task and its corresponding guardrails.
examples	Optional dictionary. The dictionary’s keys represent labels or classes, and the corresponding values are lists of sentences belonging to each class. Applicable if the selected model backbone is from the LLM family. Pydantic notation: Optional[Dict[str, List]] Example: {"Label_1": [example 1, example 2], "Label_2": [example 1, example 2]} If examples are not supplied, a data object must be provided.

Returns:: TextClassifier Object

accuracy()

Calculates the following metric:

accuracy: the number of correctly predicted labels in the validation set divided by the total number of items in the validation set

Returns:: a floating point number depicting the accuracy of the classification model.
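
The accuracy definition above amounts to a single ratio. As a minimal plain-Python sketch (the function name and the sample labels are illustrative and not taken from the library):

```python
from typing import Sequence


def accuracy_score(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Correctly predicted labels divided by the total number of items."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty validation set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Three of the four predictions match the validation labels.
score = accuracy_score(["a", "b", "a", "c"], ["a", "b", "c", "c"])
```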

classmethod available_backbone_models(architecture)

Get available models for the given transformer backbone

Parameter	Description
architecture	Required string. Name of the transformer or llm backbone one wishes to use. To learn more about the available models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

Returns:: a tuple containing the available models for the given transformer backbone

property available_metrics: List of available metrics that are displayed in the training table. Set the monitor value to one of these while calling the fit method.

fit(epochs=10, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, tensorboard=False, monitor='valid_loss', **kwargs)

Train the model for the specified number of epochs and using the specified learning rates.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
epochs	Required integer. Number of cycles of training on the data. Increase it if underfitting.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model.
one_cycle	Optional boolean. Parameter to select the 1cycle learning rate schedule. If set to False, no learning rate schedule is used.
early_stopping	Optional boolean. Parameter to add early stopping. If set to ‘True’, training will stop if the monitor value stops improving for 5 epochs. A minimum difference of 0.001 is required for a change to be considered an improvement.
checkpoint	Optional boolean or string. Parameter to save checkpoints during training. If set to True, the best model based on monitor will be saved during training. If set to ‘all’, all checkpoints are saved. If set to False, checkpointing will be off. Setting this parameter loads the best model at the end of training.
tensorboard	Optional boolean. Parameter to write the training log. If set to ‘True’, the log will be saved at <dataset-path>/training_log, which can be visualized in tensorboard. Requires tensorboardx version 2.1. The default value is ‘False’. Note: Not applicable for text models.
monitor	Optional string. Specifies which metric to monitor while checkpointing and early stopping. Defaults to ‘valid_loss’. The value should be one of the metrics displayed in the training table. Use {model_name}.available_metrics to list the available metrics to set here.
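
The early_stopping rule in the table (a patience of 5 epochs and a minimum improvement of 0.001 on the monitored metric) can be sketched in plain Python. This is an illustration under assumptions and not the library's callback: the function name is hypothetical, and the exact comparison the library uses (strict or non-strict, and how the best value is tracked) is not stated in this reference.

```python
from typing import Optional, Sequence


def early_stopping_epoch(
    values: Sequence[float],
    patience: int = 5,
    min_delta: float = 0.001,
    lower_is_better: bool = True,
) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None."""
    best = None
    epochs_without_improvement = 0
    for epoch, value in enumerate(values):
        if best is None:
            improved = True
        elif lower_is_better:
            improved = best - value > min_delta   # e.g. valid_loss
        else:
            improved = value - best > min_delta   # e.g. accuracy
        if improved:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# The loss plateaus after epoch 2: later changes are smaller than 0.001.
losses = [1.0, 0.8, 0.7, 0.6999, 0.6998, 0.6997, 0.6996, 0.6995, 0.5]
stop_at = early_stopping_epoch(losses)
```

In this run the five epochs after epoch 2 each improve by less than 0.001, so training would stop at epoch 7 and never reach the final value of 0.5.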

freeze()

Freeze up to last layer group to train only the last layer group of the model.

This method is not supported when the backbone is configured as llm/mistral.

classmethod from_model(emd_path, data=None, **kwargs)

Creates a TextClassifier model object from a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using this method.

Returns:: TextClassifier model Object

classmethod from_pretrained(backbone, **kwargs)

Creates a TextClassifier model object from an already fine-tuned Hugging Face Transformer backbone.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
backbone	Required string. Specify the Hugging Face Transformer backbone name fine-tuned on a Text Classification task. For more details on available transformer models fine-tuned on the Text Classification task, see https://huggingface.co/models?pipeline_tag=text-classification

Returns:: TextClassifier Object

get_misclassified_records()

This method is not supported when the backbone is configured as llm/mistral.

Returns:: get misclassified records for this classification model.

load(name_or_path)

Loads a saved TextClassifier model from disk.

This method is not supported when the backbone is configured as llm/mistral or when model extension is used.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
name_or_path	Required string. Path to a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

lr_find(allow_plot=True, **kwargs)

Runs the Learning Rate Finder. Helps in choosing the optimum learning rate for training the model.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
allow_plot	Optional boolean. Display the plot of losses against the learning rates and mark the optimal value of the learning rate on the plot. The default value is ‘True’.

metrics_per_label()

Returns:: precision, recall and f1 score for each label in the classification model.
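
For a single-label classifier, per-label precision, recall, and F1 follow from counting true positives, false positives, and false negatives for each label. The sketch below uses the standard definitions in plain Python. The function name and sample data are illustrative, and the structure the library actually returns is not specified in this reference.

```python
from typing import Dict, Sequence


def per_label_metrics(
    actual: Sequence[str], predicted: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F1 for every label that appears."""
    pairs = list(zip(actual, predicted))
    report = {}
    for label in sorted(set(actual) | set(predicted)):
        tp = sum(a == label and p == label for a, p in pairs)
        fp = sum(a != label and p == label for a, p in pairs)
        fn = sum(a == label and p != label for a, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


actual = ["spam", "spam", "ham", "ham"]
predicted = ["spam", "ham", "ham", "ham"]
report = per_label_metrics(actual, predicted)
```

Here "spam" has perfect precision but a recall of 0.5, because one spam item was predicted as ham.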

plot_losses()

Plot validation and training losses after fitting the model.

This method is not supported when the backbone is configured as llm/mistral.

predict(text_or_list, show_progress=True, thresh=None, explain=False, explain_index=None, batch_size=64, **kwargs) → List[Tuple] | FeatureSet

Predicts the class label(s) for the input text

Parameter	Description
text_or_list	Required string or list. Text or a list of texts for which to find the class label(s).
prompt	Optional string. Applicable if the selected model backbone is from the LLM family. This parameter is used to describe the task and the guardrails for the task.
show_progress	Optional bool. If set to True, displays a progress bar depicting the items processed so far. Applicable only when a list of text is passed.
thresh	Optional float. The threshold value used to get the class label(s). Applicable only for multi-label classification tasks. Defaults to the value set when the model was created; otherwise 0.25 is used.
explain	Optional bool. If set to True, generates a SHAP-based explanation. See https://shap.readthedocs.io/en/latest/
explain_index	Optional list. Index of the rows for which an explanation is required. If the value is None, an explanation is generated for every row.
batch_size	Optional integer. Number of inputs to be processed at once. Try reducing the batch size in case of out-of-memory errors. Default value: 64

kwargs

Parameter	Description
input_field	Optional string. Input field name in the feature set. Supported in model extension. Default value: input_str

Returns:

In case of a single-label classification problem, a tuple containing the text, its predicted class label, and the confidence score.
In case of a multi-label classification problem, a tuple containing the text, its predicted class labels, a list containing 1’s for the predicted labels and 0’s otherwise, and a list containing a score for each label.
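
How thresh turns per-label scores into the multi-label tuple described above can be sketched as follows. This is a plain-Python illustration and not the library's code: the function name, labels, and scores are made up, and whether the library compares with >= or > is an assumption.

```python
from typing import List, Sequence, Tuple


def apply_threshold(
    text: str, labels: Sequence[str], scores: Sequence[float], thresh: float = 0.25
) -> Tuple[str, List[str], List[int], List[float]]:
    """Build (text, predicted labels, 0/1 indicators, per-label scores)."""
    # Assumption: a label is picked when its score is at least `thresh`.
    indicators = [1 if score >= thresh else 0 for score in scores]
    picked = [label for label, flag in zip(labels, indicators) if flag]
    return text, picked, indicators, list(scores)


result = apply_threshold(
    "Where is my refund?",
    labels=["billing", "shipping", "returns"],
    scores=[0.70, 0.10, 0.30],
)
```

Raising thresh makes the classifier more conservative. With thresh=0.5 the same scores would keep only "billing".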

save(name_or_path, framework='PyTorch', publish=False, gis=None, compute_metrics=True, save_optimizer=False, **kwargs)

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment.

Parameter	Description
name_or_path	Required string. Folder path to save the model.
framework	Optional string. Defines the framework of the model. (Only supported by `SingleShotDetector`, currently.) If the framework used is `TF-ONNX`, `batch_size` can be passed as an optional keyword argument. Framework choices: ‘PyTorch’ and ‘TF-ONNX’.
publish	Optional boolean. Publishes the DLPK as an item.
gis	Optional `GIS` object. Used for publishing the item. If not specified, the active GIS user is taken.
compute_metrics	Optional boolean. Used for computing model metrics.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False.

kwargs

Parameter	Description
overwrite	Optional boolean. If True, it will overwrite the item on ArcGIS Online/Enterprise. Default is False.
zip_files	Optional boolean. If True, it will create the Deep Learning Package (DLPK) file while saving the model.

Returns:: the qualified path at which the model is saved

show_results(rows=5, **kwargs)

Prints the rows of the dataframe with target and prediction columns.

Parameter	Description
rows	Optional integer. Number of rows to print.

Returns:: dataframe

supported_backbones=['BERT','RoBERTa','DistilBERT','ALBERT','FlauBERT','CamemBERT','XLNet','XLM','XLM-RoBERTa','Bart','ELECTRA','Longformer','MobileBERT','Funnel','LLM']

unfreeze()

Unfreezes the earlier layers of the model for fine-tuning.

This method is not supported when the backbone is configured as llm/mistral.

SequenceToSequence

class arcgis.learn.text.SequenceToSequence(data, backbone='t5-base', **kwargs)

Creates a SequenceToSequence Object. Based on the Hugging Face transformers library.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
data	Required text data object, returned from the prepare_textdata function.
backbone	Optional string. Specify the HuggingFace transformer model name to be used to train the model. Default set to ‘t5-base’. To learn more about the available models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

kwargs

Parameter	Description
verbose	Optional string. Default set to error. The log level you want to set, that is, the amount of information to display while training or calling the various methods of this class. Allowed values are debug, info, warning, error, and critical.
seq_len	Optional integer. Default set to 512. Maximum sequence length (at sub-word level after tokenization) of the training data to be considered for training the model.
mixed_precision	Optional bool. Default set to False. If set to True, mixed precision training is used to train the model.
pretrained_path	Optional string. Path where the pre-trained model is saved. Accepts a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.
prompt	Optional string. Applicable if the selected model backbone is from the LLM family. This parameter is used to describe the task and the guardrails for the task.
examples	Optional list of tuples. Each tuple has two elements, which represent the input and the target sentence. Applicable if the selected model backbone is from the LLM family. Pydantic notation: Optional[List[List[str, str]]] Example: [["input_1", "output_1"], ["input_2", "output_2"]] If examples are not supplied, a data object must be provided.

Returns:: SequenceToSequence model object for sequence_translation task.

classmethod available_backbone_models(architecture)

Get available models for the given transformer backbone

Parameter	Description
architecture	Required string. Name of the transformer backbone one wishes to use. To learn more about the available models or choose models that are suitable for your dataset, see https://huggingface.co/transformers/pretrained_models.html To learn more about mistral, see https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

Returns:: a tuple containing the available models for the given transformer backbone

property available_metrics: List of available metrics that are displayed in the training table. Set the monitor value to one of these while calling the fit method.

fit(epochs=10, lr=None, one_cycle=True, early_stopping=False, checkpoint=True, tensorboard=False, monitor='valid_loss', **kwargs)

Train the model for the specified number of epochs and using the specified learning rates.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
epochs	Required integer. Number of cycles of training on the data. Increase it if underfitting.
lr	Optional float or slice of floats. Learning rate to be used for training the model. If `lr=None`, an optimal learning rate is automatically deduced for training the model.
one_cycle	Optional boolean. Parameter to select the 1cycle learning rate schedule. If set to False, no learning rate schedule is used.
early_stopping	Optional boolean. Parameter to add early stopping. If set to ‘True’, training will stop if the monitor value stops improving for 5 epochs. A minimum difference of 0.001 is required for a change to be considered an improvement.
checkpoint	Optional boolean or string. Parameter to save checkpoints during training. If set to True, the best model based on monitor will be saved during training. If set to ‘all’, all checkpoints are saved. If set to False, checkpointing will be off. Setting this parameter loads the best model at the end of training.
tensorboard	Optional boolean. Parameter to write the training log. If set to ‘True’, the log will be saved at <dataset-path>/training_log, which can be visualized in tensorboard. Requires tensorboardx version 2.1. The default value is ‘False’. Note: Not applicable for text models.
monitor	Optional string. Specifies which metric to monitor while checkpointing and early stopping. Defaults to ‘valid_loss’. The value should be one of the metrics displayed in the training table. Use {model_name}.available_metrics to list the available metrics to set here.

freeze()

Freeze up to last layer group to train only the last layer group of the model.

This method is not supported when the backbone is configured as llm/mistral.

classmethod from_model(emd_path, data=None, **kwargs)

Creates a SequenceToSequence model object from a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using this method.

Parameter	Description
emd_path	Required string. Path to a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.
data	Optional fastai Databunch. Data object returned from the `prepare_textdata` function, or None for inferencing. Default value: None

Returns:: SequenceToSequence Object

get_model_metrics()

Calculates the following metrics:

accuracy: the number of correctly predicted labels in the validation set divided by the total number of items in the validation set.
bleu-score: indicates the similarity between the model predictions and the ground truth text. The maximum value is 1.

Returns:: a dictionary containing the metrics for the model.
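
To give a sense of why the bleu-score is bounded by 1, here is a simplified sentence-level BLEU in plain Python: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is a sketch under assumptions. The variant the library computes (tokenization, smoothing, corpus-level or sentence-level) is not stated in this reference, and this version uses whitespace tokens with no smoothing.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference: str, candidate: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU of one candidate against one reference."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


exact = sentence_bleu("the cat sat on the mat", "the cat sat on the mat")
partial = sentence_bleu("the cat sat on the mat today", "the cat sat on the rug today")
unrelated = sentence_bleu("the cat sat on the mat", "a dog ran")
```

An exact match reaches the maximum, a near match falls strictly between 0 and 1, and text with no shared n-grams scores 0.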

load(name_or_path)

Loads a saved SequenceToSequence model from disk.

This method is not supported when the backbone is configured as llm/mistral.

To load a custom DLPK using the model extensibility support, instantiate an object of the class using from_model.

Parameter	Description
name_or_path	Required string. Path to a Deep Learning Package (DLPK) or Esri Model Definition (EMD) file.

lr_find(allow_plot=True, **kwargs)

Runs the Learning Rate Finder. Helps in choosing the optimum learning rate for training the model.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
allow_plot	Optional boolean. Display the plot of losses against the learning rates and mark the optimal value of the learning rate on the plot. The default value is ‘True’.

plot_losses(show=True)

Plot training and validation losses.

This method is not supported when the backbone is configured as llm/mistral.

Parameter	Description
show	Optional bool. Defaults to True. If set to False, the figure is not plotted but is returned. If set to True, the function plots the figure and returns nothing.

Returns:: matplotlib.figure.Figure

predict(text_or_list, batch_size=64, show_progress=True, explain=False, explain_index=None, **kwargs) → List[Tuple] | FeatureSet

Predicts the translated outcome.

Parameter	Description
text_or_list	Required input string or list of input strings.
batch_size	Optional integer. Number of inputs to be processed at once. Try reducing the batch size in case of out-of-memory errors. Default value: 64
show_progress	Optional bool. Whether to show the progress of the prediction task. Default value: True
explain	Optional bool. Enables SHAP-based importance scores. Default value: False
explain_index	Optional list. Index of the input rows for which the importance score will be generated. Default value: None

kwargs

Parameter	Description
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to 20.
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to 10.
input_field	Optional string. Input field name in the feature set. Supported in model extension. Default value: input_str

Returns:: list of tuples (input, predicted output strings) or FeatureSet.

save(name_or_path, framework='PyTorch', publish=False, gis=None, compute_metrics=True, save_optimizer=False, **kwargs)

Saves the model weights, creates an Esri Model Definition and Deep Learning Package zip for deployment.

Parameter	Description
name_or_path	Required string. Folder path to save the model.
framework	Optional string. Defines the framework of the model. (Only supported by `SingleShotDetector`, currently.) If the framework used is `TF-ONNX`, `batch_size` can be passed as an optional keyword argument. Framework choices: ‘PyTorch’ and ‘TF-ONNX’.
publish	Optional boolean. Publishes the DLPK as an item.
gis	Optional `GIS` object. Used for publishing the item. If not specified, the active GIS user is taken.
compute_metrics	Optional boolean. Used for computing model metrics.
save_optimizer	Optional boolean. Used for saving the model-optimizer state along with the model. Default is set to False.

kwargs

Parameter	Description
overwrite	Optional boolean. If True, it will overwrite the item on ArcGIS Online/Enterprise. Default is False.
zip_files	Optional boolean. If True, it will create the Deep Learning Package (DLPK) file while saving the model.

Returns:: the qualified path at which the model is saved

show_results(rows=5, **kwargs)

Prints the rows of the dataframe with target and prediction columns.

Parameter	Description
rows	Optional integer. Number of rows to print.

Returns: dataframe

supported_backbones=['T5','Bart','Marian','LLM']

unfreeze()

Unfreezes the earlier layers of the model for fine-tuning.

This method is not supported when the backbone is configured as llm/mistral.

Inference Only Models

FillMask

class arcgis.learn.text.FillMask(backbone=None,**kwargs)

Creates a FillMask object. Based on the Hugging Face transformers library.

Parameter	Description
backbone	Optional string. Specify the Hugging Face transformer model name which will be used to generate the suggestion token. To learn more about the available models for the fill-mask task, visit https://huggingface.co/models?pipeline_tag=fill-mask

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: FillMask Object

classmethod from_model(emd_path,**kwargs)

Creates a FillMask model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: FillMask Object

predict_token(text_or_list,num_suggestions=5,show_progress=True)

Generates suggestions for the masked token in the given text or list of texts

Parameter	Description
text_or_list	Required string or list. A text/sentence or a list of texts/sentences for which one wishes to generate the recommendations for the masked token.
num_suggestions	Optional integer. The number of suggestions to return. The maximum number of suggestions that can be generated for a missing token is 10.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.

Returns:

A list, or a list of lists, of dict. Each result comes as a list of dictionaries with the following keys:

sequence (str) – The corresponding input with the mask token prediction.
score (float) – The corresponding probability.
token_str (str) – The predicted token (to replace the masked one).
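As a minimal sketch of how the documented result structure can be consumed, the snippet below picks the highest-scoring suggestion for one input. The helper `best_suggestion` is hypothetical, and the sample values are invented for illustration; they are not real model output.

```python
from typing import Dict, List, Union

Suggestion = Dict[str, Union[str, float]]


def best_suggestion(suggestions: List[Suggestion]) -> Suggestion:
    """Return the suggestion with the highest score (hypothetical helper)."""
    return max(suggestions, key=lambda item: item["score"])


# Invented example of the per-input result: a list of dicts with the
# documented keys sequence, score and token_str.
sample = [
    {"sequence": "The capital of France is Lyon.", "score": 0.04, "token_str": "Lyon"},
    {"sequence": "The capital of France is Paris.", "score": 0.91, "token_str": "Paris"},
]

top = best_suggestion(sample)
print(top["token_str"], top["score"])
```

When a list of texts is passed, the same helper can be applied to each inner list in turn.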

save(name_or_path)

Saves the model files to a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

supported_backbones=['Albert','Bart','Bert','BigBird','Camembert','ConvBert','Data2VecText','Deberta','DebertaV2','DistilBert','Electra','Ernie','Esm','Flaubert','FNet','Funnel','IBert','LayoutLM','Longformer','Luke','MBart','Mega','MegatronBert','MobileBert','MPNet','Mra','Mvp','Nezha','Nystromformer','Perceiver','QDQBert','Reformer','RemBert','Roberta','RobertaPreLayerNorm','RoCBert','RoFormer','SqueezeBert','Tapas','Wav2Vec2','XLM','XLMRoberta','XLMRobertaXL','Xmod','Yoso']: supported transformer architectures

QuestionAnswering

class arcgis.learn.text.QuestionAnswering(backbone=None,**kwargs)

Creates a QuestionAnswering object. Based on the Hugging Face transformers library.

Parameter	Description
backbone	Optional string. Specify the Hugging Face transformer model name which will be used to extract the answers from a given passage/context. To learn more about the available models for the question-answering task, visit https://huggingface.co/models?pipeline_tag=question-answering

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: QuestionAnswering Object

classmethod from_model(emd_path,**kwargs)

Creates a QuestionAnswering model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: QuestionAnswering Object

get_answer(text_or_list,context,show_progress=True,explain=False,explain_start_word=True,explain_index=None,**kwargs)

Finds answers for the asked questions from the given passage/context

Parameter	Description
text_or_list	Required string or list. A question or a list of questions one wishes to seek an answer for.
context	Required string. The context associated with the question(s), which contains the answers.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.
explain	Optional bool. If set to True, will generate a SHAP-based explanation.
explain_start_word	Optional bool. If set to True, will generate a SHAP-based explanation for the start word of the answer. If set to False, will generate the explanation for the last word of the answer. E.g. Context: "Point cloud datasets are typically collected using Lidar sensors (light detection and ranging)". Question: "How is Point cloud dataset collected?" Answer: "Lidar sensors". In this example, if explain_start_word is True, it will generate the importance of the different context words that lead to the selection of "Lidar" as the starting word of the span. If explain_start_word is set to False, it will generate the explanation for the word "sensors".
explain_index	Optional list. Index of the question for which the answer needs to be generated.

kwargs

Parameter	Description
num_answers	Optional integer. The number of answers to return. The answers will be chosen by order of likelihood. Default value is set to 1.
max_answer_length	Optional integer. The maximum length of the predicted answers. Default value is set to 15.
max_question_length	Optional integer. The maximum length of the question after tokenization. Questions will be truncated if needed. Default value is set to 64.
impossible_answer	Optional bool. Whether or not to accept "impossible" as an answer. Default value is set to False.

Returns: a list, or a list of lists, containing the answer(s) for the input question(s)
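The choice made by explain_start_word can be sketched as follows. This only mirrors which word of the answer span is selected for explanation; it does not compute any SHAP values, and the helper `explained_word` is hypothetical rather than part of arcgis.learn.

```python
def explained_word(answer: str, explain_start_word: bool = True) -> str:
    """Return the word of the answer span that would be explained (hypothetical helper)."""
    words = answer.split()
    if not words:
        raise ValueError("answer must contain at least one word")
    # Start word when True, last word of the span when False.
    return words[0] if explain_start_word else words[-1]


# Answer taken from the example in the parameter description.
answer = "Lidar sensors"
print(explained_word(answer, explain_start_word=True))   # Lidar
print(explained_word(answer, explain_start_word=False))  # sensors
```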

save(name_or_path)

Saves the model files to a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

supported_backbones=['Albert','Bart','Bert','BigBird','BigBirdPegasus','Bloom','Camembert','Canine','ConvBert','Data2VecText','Deberta','DebertaV2','DistilBert','Electra','Ernie','ErnieM','Falcon','Flaubert','FNet','Funnel','GPT2','GPTNeo','GPTNeoX','GPTJ','IBert','LayoutLMv2','LayoutLMv3','LED','Lilt','Longformer','Luke','Lxmert','MarkupLM','MBart','Mega','MegatronBert','MobileBert','MPNet','Mpt','Mra','MT5','Mvp','Nezha','Nystromformer','OPT','QDQBert','Reformer','RemBert','Roberta','RobertaPreLayerNorm','RoCBert','RoFormer','Splinter','SqueezeBert','T5','UMT5','XLM','XLMRoberta','XLMRobertaXL','XLNet','Xmod','Yoso']: supported transformer architectures

TextGenerator

class arcgis.learn.text.TextGenerator(backbone=None,**kwargs)

Creates a TextGenerator object. Based on the Hugging Face transformers library.

Parameter	Description
backbone	Optional string. Specify the Hugging Face transformer model name which will be used to generate the text. To learn more about the available models for the text-generation task, visit https://huggingface.co/models?pipeline_tag=text-generation

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: TextGenerator Object

classmethod from_model(emd_path,**kwargs)

Creates a TextGenerator model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: TextGenerator Object

generate_text(text_or_list,show_progress=True,**kwargs)

Generates text for an incomplete sentence or a list of incomplete sentences

Parameter	Description
text_or_list	Required string or list. A text/sentence or a list of texts/sentences to complete.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.

kwargs

Parameter	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to the max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to a value < 1.0 to encourage the model to generate shorter sequences, or to a value > 1.0 to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list, or a list of lists, containing the generated text for the input prompt(s) / sentence(s)
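The direction of the length_penalty effect can be checked with a small worked example. It assumes the usual beam-search scoring in which a finished hypothesis is ranked by its summed log-probability divided by its length raised to length_penalty; that formula is an assumption about the underlying generation code, and the log-probabilities below are invented.

```python
def beam_score(sum_logprobs: float, length: int, length_penalty: float = 1.0) -> float:
    """Length-normalized score of a finished hypothesis (assumed formula)."""
    return sum_logprobs / (length ** length_penalty)


# Two invented hypotheses: (summed log-probability, length in tokens).
short_hyp = (-4.0, 5)
long_hyp = (-6.0, 10)

preferred = {}
for penalty in (0.5, 1.0, 2.0):
    short_score = beam_score(*short_hyp, length_penalty=penalty)
    long_score = beam_score(*long_hyp, length_penalty=penalty)
    # The hypothesis with the higher (less negative) score wins.
    preferred[penalty] = "short" if short_score > long_score else "long"

print(preferred)  # {0.5: 'short', 1.0: 'long', 2.0: 'long'}
```

Because log-probabilities are negative, dividing by a larger power of the length moves long hypotheses closer to zero, which is why larger penalties favor longer outputs.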

save(name_or_path)

Saves the model files to a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

supported_backbones=['Bart','Bert','BertGeneration','BigBird','BigBirdPegasus','BioGpt','Blenderbot','BlenderbotSmall','Bloom','Camembert','Llama','CodeGen','CpmAnt','CTRL','Data2VecText','Electra','Ernie','Falcon','Fuyu','Git','GPT2','GPT2','GPTBigCode','GPTNeo','GPTNeoX','GPTNeoXJapanese','GPTJ','Llama','Marian','MBart','Mega','MegatronBert','Mistral','Mixtral','Mpt','Musicgen','Mvp','OpenLlama','OpenAIGPT','OPT','Pegasus','Persimmon','Phi','PLBart','ProphetNet','QDQBert','Reformer','RemBert','Roberta','RobertaPreLayerNorm','RoCBert','RoFormer','Rwkv','Speech2Text2','TransfoXL','TrOCR','Whisper','XGLM','XLM','XLMProphetNet','XLMRoberta','XLMRobertaXL','XLNet','Xmod']: supported transformer architectures

TextSummarizer

class arcgis.learn.text.TextSummarizer(backbone=None,**kwargs)

Creates a TextSummarizer object. Based on the Hugging Face transformers library.

Parameter	Description
backbone	Optional string. Specify the Hugging Face transformer model name which will be used to summarize the text. To learn more about the available models for the summarization task, visit https://huggingface.co/models?pipeline_tag=summarization

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: TextSummarizer Object

classmethod from_model(emd_path,**kwargs)

Creates a TextSummarizer model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: TextSummarizer Object

save(name_or_path)

Saves the model files to a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

summarize(text_or_list,show_progress=True,**kwargs)

Summarizes the given text or list of texts

Parameter	Description
text_or_list	Required string or list. A text/passage or a list of texts/passages to generate the summary for.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.

kwargs

Parameter	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to the max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to a value < 1.0 to encourage the model to generate shorter sequences, or to a value > 1.0 to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list, or a list of lists, containing the summary/summaries for the input prompt(s) / sentence(s)

supported_backbones=['Bart','BigBirdPegasus','Blenderbot','BlenderbotSmall','EncoderDecoder','FSMT','GPTSanJapanese','LED','LongT5','M2M100','Marian','MBart','MT5','Mvp','NllbMoe','Pegasus','PegasusX','PLBart','ProphetNet','SeamlessM4T','SeamlessM4Tv2','SwitchTransformers','T5','UMT5','XLMProphetNet']: supported transformer architectures

TextTranslator

class arcgis.learn.text.TextTranslator(source_language='es',target_language='en',**kwargs)

Creates a TextTranslator object. Based on the Hugging Face transformers library. To learn more about the available models for the translation task, visit https://huggingface.co/models?pipeline_tag=translation&search=Helsinki

Parameter	Description
source_language	Optional string. Specify the language of the text you would like to get the translation of. Default value is 'es' (Spanish).
target_language	Optional string. The language into which one wishes to translate the input text. Default value is 'en' (English).

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: TextTranslator Object
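The link above filters the Hugging Face hub for Helsinki models, which are published under names of the form `Helsinki-NLP/opus-mt-<source>-<target>`. The sketch below builds such a name from the two language codes. Whether TextTranslator resolves its backbone in exactly this way is an assumption, and the helper `marian_model_name` is hypothetical.

```python
def marian_model_name(source_language: str = "es", target_language: str = "en") -> str:
    """Build a Helsinki-NLP hub model name for a language pair (hypothetical helper)."""
    return f"Helsinki-NLP/opus-mt-{source_language}-{target_language}"


# Defaults mirror the documented source_language and target_language defaults.
default_name = marian_model_name()
print(default_name)  # Helsinki-NLP/opus-mt-es-en
```

Not every language pair has a published model, so a name built this way should be checked against the hub listing before use.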

classmethod from_model(emd_path,**kwargs)

Creates a TextTranslator model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: TextTranslator Object

save(name_or_path)

Saves the translator model files on a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

supported_backbones=['MarianMT']: supported transformer architectures

translate(text_or_list,show_progress=True,**kwargs)

Translates the given text or list of texts into the target language

Parameter	Description
text_or_list	Required string or list. A text/passage or a list of texts/passages to translate.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.

kwargs

Parameter	Description
min_length	Optional integer. The minimum length of the sequence to be generated. Default value is set to the min_length parameter of the model config.
max_length	Optional integer. The maximum length of the sequence to be generated. Default value is set to the max_length parameter of the model config.
num_return_sequences	Optional integer. The number of independently computed returned sequences for each element in the batch. Default value is set to 1.
num_beams	Optional integer. Number of beams for beam search. 1 means no beam search. Default value is set to 1.
length_penalty	Optional float. Exponential penalty to the length. 1.0 means no penalty. Set to a value < 1.0 to encourage the model to generate shorter sequences, or to a value > 1.0 to encourage the model to produce longer sequences. Default value is set to 1.0.
early_stopping	Optional bool. Whether to stop the beam search when at least `num_beams` sentences are finished per batch or not. Default value is set to False.

Returns: a list, or a list of lists, containing the translation of the input prompt(s) / sentence(s) into the target language

ZeroShotClassifier

class arcgis.learn.text.ZeroShotClassifier(backbone=None,**kwargs)

Creates a ZeroShotClassifier object. Based on the Hugging Face transformers library.

Parameter	Description
backbone	Optional string. Specify the Hugging Face transformer model name which will be used to predict the class labels for the input text. To learn more about the available models for the zero-shot-classification task, visit https://huggingface.co/models?pipeline_tag=zero-shot-classification

kwargs

Parameter	Description
pretrained_path	Optional string. Path to a directory where pretrained model files are saved. If pretrained_path is provided, the model is loaded from that path on the local disk.
working_dir	Optional string. Path to a directory on the local filesystem. If the directory is not present, it will be created. This directory is used as the location to save the model.

Returns: ZeroShotClassifier Object

classmethod from_model(emd_path,**kwargs)

Creates a ZeroShotClassifier model object from an Esri Model Definition (EMD) file.

Parameter	Description
emd_path	Required string. Path to the Esri Model Definition (EMD) file or the folder with saved model files.

Returns: ZeroShotClassifier Object

predict(text_or_list,candidate_labels,show_progress=True,**kwargs)

Predicts the class label(s) for the input text

Parameter	Description
text_or_list	Required string or list. The sequence or a list of sequences to classify.
candidate_labels	Required string or list. The set of possible class labels to classify each sequence into. Can be a single label, a string of comma-separated labels, or a list of labels.
show_progress	Optional bool. If set to True, will display a progress bar depicting the items processed so far.

kwargs

Parameter	Description
multi_class	Optional boolean. Whether or not multiple candidate labels can be true. Default value is set to False.
hypothesis	Optional string. The template used to turn each label into an NLI-style hypothesis. This template must include a {} or similar syntax for the candidate label to be inserted into the template. Default value is set to "This example is {}.".

Returns:

A list of dict. Each result comes as a dictionary with the following keys:

sequence (str) – The sequence for which this is the output.
labels (List[str]) – The labels sorted by order of likelihood.
scores (List[float]) – The probabilities for each of the labels.
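The way candidate_labels and the hypothesis template combine can be sketched in a few lines. The helpers `normalize_labels` and `build_hypotheses` are hypothetical; they only illustrate the three accepted label forms and the `{}` substitution described above, not the library's internal code. The labels are invented.

```python
from typing import List, Union


def normalize_labels(candidate_labels: Union[str, List[str]]) -> List[str]:
    """Accept a single label, a comma-separated string, or a list (hypothetical helper)."""
    if isinstance(candidate_labels, str):
        candidate_labels = candidate_labels.split(",")
    return [label.strip() for label in candidate_labels if label.strip()]


def build_hypotheses(
    candidate_labels: Union[str, List[str]],
    hypothesis: str = "This example is {}.",
) -> List[str]:
    """Insert each label into the NLI-style template (hypothetical helper)."""
    return [hypothesis.format(label) for label in normalize_labels(candidate_labels)]


hypotheses = build_hypotheses("flood, wildfire")
print(hypotheses)  # ['This example is flood.', 'This example is wildfire.']
```

Each resulting hypothesis is what an NLI-style model would score against the input sequence, one per candidate label.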

save(name_or_path)

Saves the model files to a specified path on the local disk.

Parameter	Description
name_or_path	Required string. Path to save model files on the local disk.

Returns: Absolute path for the saved model

supported_backbones=['Albert','Bart','Bert','BigBird','BigBirdPegasus','BioGpt','Bloom','Camembert','Canine','Llama','ConvBert','CTRL','Data2VecText','Deberta','DebertaV2','DistilBert','Electra','Ernie','ErnieM','Esm','Falcon','Flaubert','FNet','Funnel','GPT2','GPT2','GPTBigCode','GPTNeo','GPTNeoX','GPTJ','IBert','LayoutLM','LayoutLMv2','LayoutLMv3','LED','Lilt','Llama','Longformer','Luke','MarkupLM','MBart','Mega','MegatronBert','Mistral','Mixtral','MobileBert','MPNet','Mpt','Mra','MT5','Mvp','Nezha','Nystromformer','OpenLlama','OpenAIGPT','OPT','Perceiver','Persimmon','Phi','PLBart','QDQBert','Reformer','RemBert','Roberta','RobertaPreLayerNorm','RoCBert','RoFormer','SqueezeBert','T5','Tapas','TransfoXL','UMT5','XLM','XLMRoberta','XLMRobertaXL','XLNet','Xmod','Yoso']: supported transformer architectures
