
Error while trying to run Tensorflow backend in FeatureClassifier #1849

Open
Daremitsu1 opened this issue Jun 20, 2024 · 1 comment

@Daremitsu1

Describe the bug
The bug is as follows:

The optimizer cannot recognize variable bn_Conv1/beta:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop or use legacy optimizer `tf.keras.optimizers.legacy.Adam`.

To Reproduce
Steps to reproduce the behavior:

When running `lr = model.lr_find()`, the call fails with:

    LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
    Error during learning rate finding: 'The optimizer cannot recognize variable bn_Conv1/beta:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop or use legacy optimizer `tf.keras.optimizers.legacy.Adam`.'
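The failure can be reproduced outside of arcgis with plain TensorFlow. This is a minimal sketch, assuming TF >= 2.11 (where Keras switched to the new optimizer API); the scalar variables `w1`/`w2` are hypothetical stand-ins for model weights, not part of this report. arcgis's `TfOptimWrapper.apply_gradients` (visible in the traceback) applies gradients one variable at a time, so the optimizer builds its slots against the first variable only and rejects the second; calling `optimizer.build()` with the full variable list first, as the error message suggests, avoids it:

```python
import tensorflow as tf

# Hypothetical stand-ins for a model's trainable weights.
w1 = tf.Variable(1.0, name="w1")
w2 = tf.Variable(2.0, name="w2")

# Reproduce: applying gradients one variable at a time (as arcgis's
# TfOptimWrapper does) builds the optimizer against w1 only, so the
# second call is rejected ("The optimizer cannot recognize variable ...").
opt = tf.keras.optimizers.Adam(learning_rate=0.1)
opt.apply_gradients([(tf.constant(0.5), w1)])
try:
    opt.apply_gradients([(tf.constant(0.5), w2)])
    unrecognized = False
except (KeyError, ValueError) as err:  # KeyError on Keras 2.x, ValueError on Keras 3
    unrecognized = True
    print("reproduced:", err)

# Workaround: build the optimizer with the full variable list up front;
# per-variable apply_gradients calls then succeed.
opt2 = tf.keras.optimizers.Adam(learning_rate=0.1)
opt2.build([w1, w2])
opt2.apply_gradients([(tf.constant(0.5), w1)])
opt2.apply_gradients([(tf.constant(0.5), w2)])
print("prebuilt optimizer updated both variables")
```

The error message's other suggestion, `tf.keras.optimizers.legacy.Adam`, also keeps the old per-variable behaviour on TF 2.11-2.15, though the legacy namespace is reportedly unavailable under Keras 3 (TF 2.16+). Either way, the issue is in how the wrapper feeds the optimizer, not in the input data.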

Full traceback:

KeyError Traceback (most recent call last)
Cell In[8], line 1
----> 1 lr = model.lr_find()

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\models\_arcgis_model.py:798, in ArcGISModel.lr_find(self, allow_plot)
795 self.learn.lr_find()
796 except Exception as e:
797 # if some error comes in lr_find
--> 798 raise e
799 finally:
800 self.learn.metrics = metrics

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\models\_arcgis_model.py:795, in ArcGISModel.lr_find(self, allow_plot)
791 with tempfile.TemporaryDirectory(
792 prefix="arcgisTemp_"
793 ) as _tempfolder:
794 self.learn.path = Path(_tempfolder)
--> 795 self.learn.lr_find()
796 except Exception as e:
797 # if some error comes in lr_find
798 raise e

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:634, in tf_lr_find(learn, start_lr, end_lr, num_it, stop_div, **kwargs)
632 cb = TfLRFinder(learn, start_lr, end_lr, num_it, stop_div)
633 a = int(np.ceil(num_it / len(learn.data.train_dl)))
--> 634 learn.fit(a, start_lr, callbacks=[cb], **kwargs)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:370, in TfLearner.fit(self, epochs, lr, wd, callbacks)
368 self.create_opt(lr, wd)
369 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
--> 370 tf_fit(
371 epochs,
372 self.model,
373 self.loss_func,
374 opt=self.opt,
375 data=self.data,
376 metrics=self.metrics,
377 callbacks=self.callbacks + callbacks,
378 )

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:300, in tf_fit(epochs, model, loss_func, opt, data, callbacks, metrics)
298 except Exception as e:
299 exception = e
--> 300 raise e
301 finally:
302 cb_handler.on_train_end(exception)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:282, in tf_fit(epochs, model, loss_func, opt, data, callbacks, metrics)
280 xb, yb = _pytorch_to_tf_batch(xb), _pytorch_to_tf(yb)
281 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 282 loss = tf_loss_batch(model, xb, yb, loss_func, opt, cb_handler)
283 if cb_handler.on_batch_end(loss):
284 break

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:188, in tf_loss_batch(model, xb, yb, loss_func, opt, cb_handler)
186 grads = tape.gradient(loss, model.trainable_variables)
187 cb_handler.on_backward_end()
--> 188 opt.apply_gradients(zip(grads, model.trainable_variables))
189 cb_handler.on_step_end()
190 else:

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\arcgis\learn\_utils\fastai_tf_fit.py:673, in TfOptimWrapper.apply_gradients(self, grads_and_vars)
671 if next_var[0] is None:
672 continue
--> 673 opt.apply_gradients([next_var])

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:1174, in Optimizer.apply_gradients(self, grads_and_vars, name, skip_gradients_aggregation, **kwargs)
1172 if not skip_gradients_aggregation and experimental_aggregate_gradients:
1173 grads_and_vars = self.aggregate_gradients(grads_and_vars)
-> 1174 return super().apply_gradients(grads_and_vars, name=name)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:650, in _BaseOptimizer.apply_gradients(self, grads_and_vars, name)
648 self._apply_weight_decay(trainable_variables)
649 grads_and_vars = list(zip(grads, trainable_variables))
--> 650 iteration = self._internal_apply_gradients(grads_and_vars)
652 # Apply variable constraints after applying gradients.
653 for variable in trainable_variables:

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:1200, in Optimizer._internal_apply_gradients(self, grads_and_vars)
1199 def _internal_apply_gradients(self, grads_and_vars):
-> 1200 return tf.__internal__.distribute.interim.maybe_merge_call(
1201 self._distributed_apply_gradients_fn,
1202 self._distribution_strategy,
1203 grads_and_vars,
1204 )

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\tensorflow\python\distribute\merge_call_interim.py:51, in maybe_merge_call(fn, strategy, *args, **kwargs)
31 """Maybe invoke fn via merge_call which may or may not be fulfilled.
32
33 The caller of this utility function requests to invoke fn via merge_call
(...)
48 The return value of the fn call.
49 """
50 if strategy_supports_no_merge_call():
---> 51 return fn(strategy, *args, **kwargs)
52 else:
53 return distribution_strategy_context.get_replica_context().merge_call(
54 fn, args=args, kwargs=kwargs)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:1250, in Optimizer._distributed_apply_gradients_fn(self, distribution, grads_and_vars, **kwargs)
1247 return self._update_step(grad, var)
1249 for grad, var in grads_and_vars:
-> 1250 distribution.extended.update(
1251 var, apply_grad_to_update_var, args=(grad,), group=False
1252 )
1254 if self.use_ema:
1255 _, var_list = zip(*grads_and_vars)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2637, in StrategyExtendedV2.update(self, var, fn, args, kwargs, group)
2634 fn = autograph.tf_convert(
2635 fn, autograph_ctx.control_status_ctx(), convert_by_default=False)
2636 with self._container_strategy().scope():
-> 2637 return self._update(var, fn, args, kwargs, group)
2638 else:
2639 return self._replica_ctx_update(
2640 var, fn, args=args, kwargs=kwargs, group=group)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3710, in _DefaultDistributionExtended._update(self, var, fn, args, kwargs, group)
3707 def _update(self, var, fn, args, kwargs, group):
3708 # The implementations of _update() and _update_non_slot() are identical
3709 # except _update() passes var as the first argument to fn().
-> 3710 return self._update_non_slot(var, fn, (var,) + tuple(args), kwargs, group)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3716, in _DefaultDistributionExtended._update_non_slot(self, colocate_with, fn, args, kwargs, should_group)
3712 def _update_non_slot(self, colocate_with, fn, args, kwargs, should_group):
3713 # TODO(josh11b): Figure out what we should be passing to UpdateContext()
3714 # once that value is used for something.
3715 with UpdateContext(colocate_with):
-> 3716 result = fn(*args, **kwargs)
3717 if should_group:
3718 return result

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\tensorflow\python\autograph\impl\api.py:595, in call_with_unspecified_conversion_status.<locals>.wrapper(*args, **kwargs)
593 def wrapper(*args, **kwargs):
594 with ag_ctx.ControlStatusCtx(status=ag_ctx.Status.UNSPECIFIED):
--> 595 return func(*args, **kwargs)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:1247, in Optimizer._distributed_apply_gradients_fn.<locals>.apply_grad_to_update_var(var, grad)
1245 return self._update_step_xla(grad, var, id(self._var_key(var)))
1246 else:
-> 1247 return self._update_step(grad, var)

File D:\Python Projects\NexICAT\propnex\Lib\site-packages\keras\optimizers\optimizer.py:232, in _BaseOptimizer._update_step(self, gradient, variable)
230 return
231 if self._var_key(variable) not in self._index_dict:
--> 232 raise KeyError(
233 f"The optimizer cannot recognize variable {variable.name}. "
234 "This usually means you are trying to call the optimizer to "
235 "update different parts of the model separately. Please call "
236 "`optimizer.build(variables)` with the full list of trainable "
237 "variables before the training loop or use legacy optimizer "
238 f"`tf.keras.optimizers.legacy.{self.__class__.__name__}`."
239 )
240 self.update_step(gradient, variable)

KeyError: 'The optimizer cannot recognize variable bn_Conv1/beta:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop or use legacy optimizer `tf.keras.optimizers.legacy.Adam`.'


Expected behavior
The images should be read and a learning rate found without error.
I have used the notebook file from: https://github.com/Esri/arcgis-python-api/blob/master/samples/04_gis_analysts_data_scientists/wildlife_species_identification_in_camera_trap_images.ipynb

Platform (please complete the following information):

  • OS: Windows
  • Browser: Chrome
  • Python API Version (from print(arcgis.__version__)): 2.3.0

Additional context
The images I have used are my own. Some are attached for reference: 11-02-22-after (21), 11-02-22-after (22), 11-02-22-after (23).

Kindly help.

@nanaeaubry (Contributor)

@priyankatuteja Can you please help with this?
