monitor – Monitor Training of Neural Networks
- class neural_monitor.monitor.Monitor
Collects statistics and displays the results using various backends. The collected stats are stored in <root>/<model_name>/<prefix><#id>, where #id is automatically assigned each time a new run starts.
The following snippet shows how to plot smoothed training losses and save images from the current iteration, and then display them every 100 iterations.
from neural_monitor import monitor as mon

# Tensorboard is turned on by default
mon.initialize(model_name='foo-model', print_freq=100, use_tensorboard=True)
...

def calculate_loss(pred, gt):
    ...
    training_loss = ...
    # schedule a smoothed plot of the loss computed above
    mon.plot('training loss', training_loss, smooth=.99, filter_outliers=True)

def calculate_acc(pred, gt):
    accuracy = ...
    mon.plot('training acc', accuracy, smooth=.99, filter_outliers=True)

...
for epoch in mon.iter_epoch(range(n_epochs)):
    for data in mon.iter_batch(data_loader):
        pred = net(data)
        calculate_loss(pred, gt)
        calculate_acc(pred, gt)
        mon.imwrite('input images', data['images'], latest_only=True)
    mon.dump('checkpoint.pt', {
        'model_state_dict': net.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        ...
    }, method='torch', keep=5)  # keep only 5 latest checkpoints
...
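The <root>/<model_name>/<prefix><#id> layout can be illustrated with a short, standalone sketch. It is not the library's implementation: the helper name `next_run_folder`, the choice to start ids at 1, and the rule "one more than the highest existing id" are assumptions made purely for illustration.

```python
import os
import re
import tempfile


def next_run_folder(root, model_name, prefix='run'):
    """Create and return <root>/<model_name>/<prefix><id> (hypothetical helper)."""
    model_dir = os.path.join(root, model_name)
    os.makedirs(model_dir, exist_ok=True)
    # Collect the numeric ids of folders that already follow the naming scheme.
    pattern = re.compile(re.escape(prefix) + r'(\d+)$')
    ids = []
    for entry in os.listdir(model_dir):
        match = pattern.match(entry)
        if match:
            ids.append(int(match.group(1)))
    # Assumption: ids start at 1 and grow by one per run.
    run_id = max(ids) + 1 if ids else 1
    folder = os.path.join(model_dir, '%s%d' % (prefix, run_id))
    os.makedirs(folder)
    return folder


root = tempfile.mkdtemp()
first = next_run_folder(root, 'foo-model')
second = next_run_folder(root, 'foo-model')
```

Each call yields a fresh folder, so results from earlier runs are never overwritten.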
- current_folder
path to the current run.
- writer
an instance of Tensorboard’s SummaryWriter when use_tensorboard is set to True.
- plot_folder
path to the folder containing the collected plots.
- file_folder
path to the folder containing the collected files.
- image_folder
path to the folder containing the collected images.
- hist_folder
path to the folder containing the collected histograms.
- backup(files_or_folders: Union[str, List[str]], ignores: Union[str, List[str]] = None, includes: Union[str, List[str]] = None)
saves a copy of the given files to current_folder. Accepts a str or a list/tuple of file or folder names. You can back up your code and/or config files for later use.
- Parameters
files_or_folders – files or folders to be saved.
ignores – files or patterns to ignore. Default: None.
includes – files or patterns to include. Default: None.
- Returns
None.
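The idea of copying files or folders while skipping ignored patterns can be sketched with the standard library alone. This is only an illustration under stated assumptions: `backup_sketch` is a hypothetical helper, the patterns are treated as shell-style globs, and the real `backup()` may resolve `ignores` and `includes` differently.

```python
import fnmatch
import os
import shutil
import tempfile


def backup_sketch(files_or_folders, dest, ignores=None):
    """Copy files/folders into dest, skipping names that match an ignore pattern."""
    if isinstance(files_or_folders, str):
        files_or_folders = [files_or_folders]
    if isinstance(ignores, str):
        ignores = [ignores]
    ignores = list(ignores or [])
    os.makedirs(dest, exist_ok=True)
    for path in files_or_folders:
        name = os.path.basename(os.path.normpath(path))
        if any(fnmatch.fnmatch(name, pat) for pat in ignores):
            continue
        target = os.path.join(dest, name)
        if os.path.isdir(path):
            # ignore_patterns applies the same glob patterns inside the folder.
            shutil.copytree(path, target, ignore=shutil.ignore_patterns(*ignores))
        else:
            shutil.copy2(path, target)


# Build a tiny project folder to back up.
src = tempfile.mkdtemp()
with open(os.path.join(src, 'train.py'), 'w') as f:
    f.write('print("train")\n')
with open(os.path.join(src, 'notes.tmp'), 'w') as f:
    f.write('scratch\n')

dest = os.path.join(tempfile.mkdtemp(), 'backup')
backup_sketch(src, dest, ignores='*.tmp')
copied = sorted(os.listdir(os.path.join(dest, os.path.basename(src))))
```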
- clear_hist_stats(key: Union[int, str, Tuple])
removes the collected statistics for histogram plot of the specified key.
- Parameters
key – the name of the histogram collection.
- Returns
None.
- clear_num_stats(key)
removes the collected statistics for scalar plot of the specified key.
- Parameters
key – the name of the scalar collection.
- Returns
None.
- dump(name: str, obj: Any, method: str = 'pickle', keep: int = -1, **kwargs)
saves the given object.
- Parameters
name – name of the file to be saved.
obj – object to be saved.
method – str or callable. If callable, it should be a custom method to dump the object. There are 3 types of str.
'pickle': use pickle.dump() to store the object.
'torch': use torch.save() to store the object.
'txt': use numpy.savetxt() to store the object.
Default: 'pickle'.
keep – the number of versions of the saved file to keep. Default: -1 (keeps only the latest version).
kwargs – additional keyword arguments to the underlying save function.
- Returns
None.
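The effect of `keep` can be pictured with a small, standalone sketch of versioned saving. The helper `dump_versioned` and the `<name>-<version>` file naming are assumptions made for illustration; the library's actual on-disk naming is not described here.

```python
import os
import pickle
import tempfile


def dump_versioned(folder, name, obj, keep=-1):
    """Pickle obj as <name>-<version> and prune old versions (hypothetical scheme)."""
    existing = sorted(
        int(f.rsplit('-', 1)[1]) for f in os.listdir(folder)
        if f.startswith(name + '-') and f.rsplit('-', 1)[1].isdigit()
    )
    version = existing[-1] + 1 if existing else 0
    with open(os.path.join(folder, '%s-%d' % (name, version)), 'wb') as f:
        pickle.dump(obj, f)
    existing.append(version)
    # keep=-1 is treated as "keep only the latest version", as documented.
    n_keep = 1 if keep < 1 else keep
    for old in existing[:-n_keep]:
        os.remove(os.path.join(folder, '%s-%d' % (name, old)))
    return version


folder = tempfile.mkdtemp()
for step in range(7):
    dump_versioned(folder, 'checkpoint.pkl', {'step': step}, keep=3)

remaining = sorted(os.listdir(folder))
with open(os.path.join(folder, remaining[-1]), 'rb') as f:
    latest = pickle.load(f)
```

After seven saves with `keep=3`, only the three most recent versions remain on disk.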
- dump_model(network, use_tensorboard=False, *args, **kwargs)
saves a string representation of the given neural net.
- Parameters
network – neural net to be saved as string representation.
use_tensorboard – use tensorboard to save network’s graph.
args – additional arguments to Tensorboard’s SummaryWriter() when use_tensorboard is True.
kwargs – additional keyword arguments to Tensorboard’s SummaryWriter() when use_tensorboard is True.
- Returns
None.
- dump_rep(name, obj)
saves a string representation of the given object.
- Parameters
name – name of the txt file containing the string representation.
obj – object to be saved as string representation.
- Returns
None.
- property epoch: int
returns the current epoch.
- Returns
_last_epoch.
- flush()
executes all the scheduled plots. Do not call this if using Monitor’s context manager mode.
- Returns
None.
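The "schedule now, render later" behaviour behind `flush()` can be pictured with a toy queue. This is a conceptual sketch only: the class `DeferredPlots` and its `step()` method are hypothetical and are not part of the Monitor API.

```python
class DeferredPlots:
    """Queue plot requests and run them every print_freq iterations (illustrative only)."""

    def __init__(self, print_freq):
        self.print_freq = print_freq
        self.iteration = 0
        self.pending = []   # scheduled (name, value) pairs
        self.rendered = []  # requests that have actually been executed

    def plot(self, name, value):
        # Scheduling only records the request; nothing is drawn yet.
        self.pending.append((name, value))

    def flush(self):
        # Execute everything that has been scheduled so far.
        self.rendered.extend(self.pending)
        self.pending.clear()

    def step(self):
        self.iteration += 1
        if self.iteration % self.print_freq == 0:
            self.flush()


plots = DeferredPlots(print_freq=3)
for i in range(4):
    plots.plot('loss', 1.0 / (i + 1))
    plots.step()

rendered_before = len(plots.rendered)  # requests executed at iteration 3
pending_before = len(plots.pending)    # request from iteration 4 still waiting
plots.flush()                          # manual flush executes the rest
```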
- hist(name, value: Union[torch.Tensor, numpy.ndarray], n_bins: int = 20, latest_only: bool = False, **kwargs)
schedules a histogram plot of (a batch of) points. A matplotlib figure will be rendered and saved every print_freq iterations.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – any-dim tensor to be histogrammed.
n_bins – number of bins of the histogram.
latest_only – whether to save only the latest statistics or keep everything from the beginning.
kwargs – additional options to tensorboard.
- Returns
None.
- property hist_stats
returns the tensors collected since the beginning.
- Returns
_hist_since_beginning.
- imwrite(name: str, value: Union[torch.Tensor, numpy.ndarray], latest_only: Optional[bool] = False, **kwargs)
schedules images to be saved. The images will be rendered and saved every print_freq iterations. There are some assumptions about the input data:
If the input is 'uint8', it is an 8-bit image.
If the input is 'float32', its values lie between 0 and 1.
If the input has 3 dims, the shape is [h, w, 3] or [h, w, 1].
If the channel dim is different from 3 or 1, it will be considered as multiple gray images.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – 2D, 3D or 4D tensor to be plotted. The expected shape is (H, W) for 2D tensor, (H, W, C) for 3D tensor and (N, C, H, W) for 4D tensor. If the number of channels is other than 3 or 1, each channel is saved as a gray image.
latest_only – whether to save only the latest statistics or keep everything from the beginning.
kwargs – additional options to tensorboard.
- Returns
None.
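The shape conventions listed for `imwrite` can be summarised in a small helper. This is one reading of the documented rules, not the library's code: `describe_images` is a hypothetical function, and the way extra channels are counted per batch item is an assumption.

```python
def describe_images(shape):
    """Return (number_of_images, kind) for a 2D/3D/4D shape, per the documented conventions."""
    if len(shape) == 2:                      # (H, W)
        return 1, 'gray'
    if len(shape) == 3:                      # (H, W, C)
        n, channels = 1, shape[2]
    elif len(shape) == 4:                    # (N, C, H, W)
        n, channels = shape[0], shape[1]
    else:
        raise ValueError('expected a 2D, 3D or 4D shape, got %r' % (shape,))
    if channels == 3:
        return n, 'color'
    if channels == 1:
        return n, 'gray'
    # Any other channel count: every channel becomes its own gray image.
    return n * channels, 'gray'


single_rgb = describe_images((64, 64, 3))
rgb_batch = describe_images((8, 3, 32, 32))
feature_maps = describe_images((8, 5, 32, 32))
```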
- initialize(model_name: Optional[str] = None, root: Optional[str] = None, current_folder: Optional[str] = None, print_freq: Optional[int] = 1, num_iters: Optional[int] = None, prefix: Optional[str] = 'run', use_tensorboard: Optional[bool] = True, with_git: Optional[bool] = False) → None
- Parameters
model_name – name of the experimented model. Default: 'my-model'.
root – root path to store results. Default: 'results'.
current_folder – the folder that the experiment is currently dumped to. Note that if current_folder already exists, all of its contents will be loaded. This option can be used for loading a trained model. Default: None.
print_freq – frequency, in iterations, of printing to stdout. Default: 1.
num_iters – number of iterations per epoch. If not provided, it will be calculated after one epoch. Default: None.
prefix – a common prefix that is shared between folder names of different runs. Default: 'run'.
use_tensorboard – whether to use Tensorboard. Default: True.
with_git – whether to retrieve some Git information. Should be used only when the project is initialized with Git. Default: False.
- Returns
None.
- property iter: int
returns the current iteration.
- Returns
_iter.
- iter_batch(iterator: Iterable) → Any
tracks training iteration and returns the item in iterator.
- Parameters
iterator – the batch iterator, e.g., enumerate(loader).
- Returns
a generator over iterator.
>>> from neural_monitor import monitor as mon
>>> mon.print_freq = 1000
>>> data_loader = ...
>>> num_epochs = 10
>>> for epoch in mon.iter_epoch(range(num_epochs)):
...     for idx, data in mon.iter_batch(enumerate(data_loader)):
...         # do something here
- iter_epoch(iterator: Iterable) → Any
tracks training epoch and returns the item in iterator.
- Parameters
iterator – the epoch iterator, e.g., range(num_epochs).
- Returns
a generator over iterator.
>>> from neural_monitor import monitor as mon
>>> mon.print_freq = 1000
>>> num_epochs = 10
>>> for epoch in mon.iter_epoch(range(mon.epoch, num_epochs)):
...     # do something here
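How a wrapper around an iterator can track progress while still handing back the original items is sketched below. The class `ProgressTracker` is hypothetical and only mirrors the idea; in particular, the choice to increment the counters after each loop body finishes is an assumption, not documented behaviour of `iter_epoch` or `iter_batch`.

```python
class ProgressTracker:
    """Tiny stand-in showing how wrapping an iterator can track progress."""

    def __init__(self):
        self.epoch = 0
        self.iter = 0

    def iter_epoch(self, iterator):
        for item in iterator:
            yield item
            self.epoch += 1  # counted once the body for this epoch has finished

    def iter_batch(self, iterator):
        for item in iterator:
            yield item
            self.iter += 1   # counted once the body for this batch has finished


tracker = ProgressTracker()
seen = []
for epoch in tracker.iter_epoch(range(2)):
    for idx, batch in tracker.iter_batch(enumerate(['a', 'b', 'c'])):
        seen.append((epoch, idx, batch))
```

Because the generators yield the items unchanged, the training loop is written exactly as it would be without tracking.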
- load(file: str, method: str = 'pickle', version: int = -1, **kwargs)
loads from the given file.
- Parameters
file – name of the saved file without version.
method – str or callable. If callable, it should be a custom method to load the object. There are 3 types of str.
'pickle': use pickle.load() to load the object.
'torch': use torch.load() to load the object.
'txt': use numpy.loadtxt() to load the object.
Default: 'pickle'.
version – the version of the saved file to load. Default: -1 (loads the latest version of the saved file).
kwargs – additional keyword arguments to the underlying load function.
- Returns
the loaded object.
- property model_name: str
returns the name of the model.
- Returns
_model_name.
- property num_stats
returns the scalar statistics collected since the beginning.
- Returns
_num_since_beginning.
- plot(name: str, value: Union[torch.Tensor, numpy.ndarray, float], smooth: Optional[float] = 0, filter_outliers: Optional[bool] = True, **kwargs)
schedules a plot of a scalar value. A matplotlib figure will be rendered and saved every print_freq iterations.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – scalar value to be plotted.
smooth – a value between 0 and 1 to define the smoothing window size. See smooth(). Default: 0.
filter_outliers – whether to filter out outliers in plot. This affects only the plot and not the raw statistics. Default: True.
kwargs – additional options to tensorboard.
- Returns
None.
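One common way a smoothing factor between 0 and 1 is applied to a training curve is an exponential moving average, as in Tensorboard's smoothing slider. The sketch below assumes that interpretation; the helper `ema_smooth` is hypothetical, and the library's own `smooth()` may compute the window differently.

```python
def ema_smooth(values, weight):
    """Exponential moving average; weight=0 returns the values unchanged."""
    smoothed = []
    last = None
    for v in values:
        # The first point is kept as is; later points blend with the running average.
        last = v if last is None else weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


raw = [1.0, 0.0, 1.0, 0.0]
unchanged = ema_smooth(raw, 0.0)  # no smoothing
damped = ema_smooth(raw, 0.5)     # oscillation is reduced
```

A weight close to 1, such as the `.99` used in the class-level example, makes the curve follow the long-run trend rather than individual noisy iterations.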
- plot_matrix(name: str, value: Union[torch.Tensor, numpy.ndarray, float], labels: Union[List[str], List[List[str]]] = None, show_values: bool = False)
plots the given matrix with colorbar and labels if provided.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – matrix value to be plotted.
labels – labels of each axis. Can be a list/tuple of strings or a nested list/tuple. Default: None.
show_values – whether to show the values of the matrix. Default: False.
- Returns
None.
- property prefix: str
returns the prefix of saved folders.
- Returns
_prefix.
- reset()
factory-resets the monitor object. This includes clearing all the collected data, setting the iteration and epoch counters to 0, and resetting the timer.
- Returns
None.
- scatter(name: str, value: Union[torch.Tensor, numpy.ndarray], latest_only: bool = False, **kwargs)
schedules a scatter plot of (a batch of) points. A 3D matplotlib figure will be rendered and saved every print_freq iterations.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – 2D or 3D tensor to be plotted. The last dim should be 3.
latest_only – whether to save only the latest statistics or keep everything from the beginning.
kwargs – additional options to tensorboard.
- Returns
None.
- neural_monitor.monitor.collect_tracked_variables(name=None, return_name=False)
Gets tracked variable given name.
- Parameters
name – name of the tracked variable. Can be a str or a list/tuple of strs. If None, all the tracked variables will be returned.
return_name – whether to return the names of the tracked variables.
- Returns
the tracked variables.
- neural_monitor.monitor.get_tracked_variables() → Dict
Retrieves the values of tracked variables.
- Returns
a dictionary containing the values of tracked variables associated with the given names.
- neural_monitor.monitor.track(name: str, x: Union[torch.Tensor, torch.nn.Module], direction: Optional[str] = None) → Union[torch.Tensor, torch.nn.Module]
An identity function that registers hooks to track the value and gradient of the specified tensor.
Here is an example of how to track an intermediate output:
# assumed imports for the names used below (np, T, nn, testing)
import numpy as np
import torch as T
import torch.nn as nn
from numpy import testing

import neuralnet_pytorch as nnt
from neural_monitor import track, get_tracked_variables

input = ...
# track both the output and the gradient of a module
conv1 = track('op', nn.Conv2d(dim, 4, 3), 'all')
conv2 = nn.Conv2d(4, 5, 3)
intermediate = conv1(input)
# track both the value and the gradient of a tensor
output = track('conv2_output', conv2(intermediate), 'all')
loss = T.sum(output ** 2)
loss.backward(retain_graph=True)
d_inter = T.autograd.grad(loss, intermediate, retain_graph=True)
d_out = T.autograd.grad(loss, output)
tracked = get_tracked_variables()

testing.assert_allclose(tracked['conv2_output'], nnt.utils.to_numpy(output))
testing.assert_allclose(np.stack(tracked['grad_conv2_output']), nnt.utils.to_numpy(d_out[0]))
testing.assert_allclose(tracked['op'], nnt.utils.to_numpy(intermediate))
for d_inter_, tracked_d_inter_ in zip(d_inter, tracked['grad_op_output']):
    testing.assert_allclose(tracked_d_inter_, nnt.utils.to_numpy(d_inter_))
- Parameters
name – name of the tracked tensor.
x – tensor or module to be tracked. If module, the output of the module will be tracked.
direction – there are 4 options:
None: tracks only value.
'forward': tracks only value.
'backward': tracks only gradient.
'all': tracks both value and gradient.
Default: None.
- Returns
x.