monitor
– Monitor Training of Neural Networks¶
- class neural_monitor.monitor.Monitor[source]¶
Collects statistics and displays the results using various backends. The collected stats are stored in
<root>/<model_name>/<prefix><#id>
where #id is automatically assigned each time a new run starts.The following snippet shows how to plot smoothed training losses and save images from the current iteration, and then display them every 100 iterations.
from neural_monitor import monitor as mon # Tensorboard is turned on by default mon.initialize(model_name='foo-model', print_freq=100, use_tensorboard=True) ... def calculate_loss(pred, gt): ... training_loss = ... mon.plot('training loss', loss, smooth=.99, filter_outliers=True) def calculate_acc(pred, gt): accuracy = ... mon.plot('training acc', accuracy, smooth=.99, filter_outliers=True) ... for epoch in mon.iter_epoch(range(n_epochs)): for data in mon.iter_batch(data_loader): pred = net(data) calculate_loss(pred, gt) calculate_acc(pred, gt) mon.imwrite('input images', data['images'], latest_only=True) mon.dump('checkpoint.pt', { 'model_state_dict': model.state_dict(), 'optimizer_state_dict': optimizer.state_dict(), ... }, method='torch', keep=5) # keep only 5 latest checkpoints ...
- current_folder
path to the current run.
- writer
an instance of Tensorboard’s
SummaryWriter
when use_tensorboard is set toTrue
.- plot_folder
path to the folder containing the collected plots.
- file_folder
path to the folder containing the collected files.
- image_folder
path to the folder containing the collected images.
- hist_folder
path to the folder containing the collected histograms.
- backup(files_or_folders: Union[str, List[str]], ignores: Union[str, List[str]] = None, includes: Union[str, List[str]] = None)[source]¶
saves a copy of the given files to
current_folder
. Accepts a str or list/tuple of file or folder names. You can backup your codes and/or config files for later use.- Parameters
files_or_folders – files or folders to be saved.
ignores – files or patterns to ignore. Default:
None
.includes – files or patterns to include. Default:
None
.
- Returns
None
.
- clear_hist_stats(key: Union[int, str, Tuple])[source]¶
removes the collected statistics for histogram plot of the specified key.
- Parameters
key – the name of the histogram collection.
- Returns
None
.
- clear_num_stats(key)[source]¶
removes the collected statistics for scalar plot of the specified key.
- Parameters
key – the name of the scalar collection.
- Returns
None
.
- dump(name: str, obj: Any, method: str = 'pickle', keep: int = - 1, **kwargs)[source]¶
saves the given object.
- Parameters
name – name of the file to be saved.
obj – object to be saved.
method –
str
orcallable
. Ifcallable
, it should be a custom method to dump object. There are 3 types ofstr
.'pickle'
: usepickle.dump()
to store object.'torch'
: usetorch.save()
to store object.'txt'
: usenumpy.savetxt()
to store object.Default:
'pickle'
.keep – the number of versions of the saved file to keep. Default: -1 (keeps only the latest version).
kwargs – additional keyword arguments to the underlying save function.
- Returns
None
.
- dump_model(network, use_tensorboard=False, *args, **kwargs)[source]¶
saves a string representation of the given neural net.
- Parameters
network – neural net to be saved as string representation.
use_tensorboard – use tensorboard to save network’s graph.
args – additional arguments to Tensorboard’s
SummaryWriter()
when use_tensorboard isTrue
.kwargs – additional keyword arguments to Tensorboard’s
SummaryWriter()
when ~se_tensorboard isTrue
.
- Returns
None
.
- dump_rep(name, obj)[source]¶
saves a string representation of the given object.
- Parameters
name – name of the txt file containing the string representation.
obj – object to saved as string representation.
- Returns
None
.
- property epoch: int¶
returns the current epoch.
- Returns
_last_epoch
.
- flush()[source]¶
executes all the scheduled plots. Do not call this if using
Monitor
’s context manager mode.- Returns
None
.
- hist(name, value: Union[torch.Tensor, numpy.ndarray], n_bins: int = 20, latest_only: bool = False, **kwargs)[source]¶
schedules a histogram plot of (a batch of) points. A
matplotlib
figure will be rendered and saved everyprint_freq
iterations.- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – any-dim tensor to be histogrammed.
n_bins – number of bins of the histogram.
latest_only – whether to save only the latest statistics or keep everything from beginning.
kwargs – additional options to tensorboard
- Returns
None
.
- property hist_stats¶
returns the collected tensors from beginning.
- Returns
_hist_since_beginning
.
- imwrite(name: str, value: Union[torch.Tensor, numpy.ndarray], latest_only: Optional[bool] = False, **kwargs)[source]¶
schedules to save images. The images will be rendered and saved every
print_freq
iterations. There are some assumptions about input data:If the input is
'uint8'
it is an 8-bit image.If the input is
'float32'
, its values lie between0
and1
.If the input has 3 dims, the shape is
[h, w, 3]
or[h, w, 1]
.If the channel dim is different from 3 or 1, it will be considered as multiple gray images.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – 2D, 3D or 4D tensor to be plotted. The expected shape is
(H, W)
for 2D tensor,(H, W, C)
for 3D tensor and(N, C, H, W)
for 4D tensor. If the number of channels is other than 3 or 1, each channel is saved as a gray image.latest_only – whether to save only the latest statistics or keep everything from beginning.
kwargs – additional options to tensorboard.
- Returns
None
.
- initialize(model_name: Optional[str] = None, root: Optional[str] = None, current_folder: Optional[str] = None, print_freq: Optional[int] = 1, num_iters: Optional[int] = None, prefix: Optional[str] = 'run', use_tensorboard: Optional[bool] = True, with_git: Optional[bool] = False) → None[source]¶
- Parameters
model_name – name of the experimented model. Default:
'my-model'
.root – root path to store results. Default:
'results'
.current_folder – the folder that the experiment is currently dump to. Note that if current_folder already exists, all the contents will be loaded. This option can be used for loading a trained model. Default:
None
.print_freq – frequency of stdout. Default: 1.
num_iters – number of iterations per epoch. If not provided, it will be calculated after one epoch. Default:
None
.prefix – a common prefix that is shared between folder names of different runs. Default:
'run'
.use_tensorboard – whether to use Tensorboard. Default:
True
.with_git – whether to retrieve some Git information. Should be used only when the project is initialized with Git. Default:
False
.
- Returns
None
.
- property iter: int¶
returns the current iteration.
- Returns
_iter
.
- iter_batch(iterator: Iterable) → Any[source]¶
tracks training iteration and returns the item in iterator.
- Parameters
iterator – the batch iterator. For e.g.,
enumerator(loader)
.- Returns
a generator over iterator.
>>> from neuralnet_pytorch import monitor as mon >>> mon.print_freq = 1000 >>> data_loader = ... >>> num_epochs = 10 >>> for epoch in mon.iter_epoch(range(num_epochs)): ... for idx, data in mon.iter_batch(enumerate(data_loader)): ... # do something here
- iter_epoch(iterator: Iterable) → Any[source]¶
tracks training epoch and returns the item in iterator.
- Parameters
iterator – the epoch iterator. For e.g.,
range(num_epochs)
.- Returns
a generator over iterator.
>>> from neuralnet_pytorch import monitor as mon >>> mon.print_freq = 1000 >>> num_epochs = 10 >>> for epoch in mon.iter_epoch(range(mon.epoch, num_epochs)) ... # do something here
- load(file: str, method: str = 'pickle', version: int = - 1, **kwargs)[source]¶
loads from the given file.
- Parameters
file – name of the saved file without version.
method –
str
orcallable
. Ifcallable
, it should be a custom method to load object. There are 3 types ofstr
.'pickle'
: usepickle.dump()
to store object.'torch'
: usetorch.save()
to store object.'txt'
: usenumpy.savetxt()
to store object.Default:
'pickle'
.version – the version of the saved file to load. Default: -1 (loads the latest version of the saved file).
kwargs – additional keyword arguments to the underlying load function.
- Returns
None
.
- property model_name: str¶
returns the name of the model.
- Returns
_model_name
.
- property num_stats¶
returns the collected scalar statistics from beginning.
- Returns
_num_since_beginning
.
- plot(name: str, value: Union[torch.Tensor, numpy.ndarray, float], smooth: Optional[float] = 0, filter_outliers: Optional[bool] = True, **kwargs)[source]¶
schedules a plot of scalar value. A
matplotlib
figure will be rendered and saved everyprint_freq
iterations.- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – scalar value to be plotted.
smooth – a value between
0
and1
to define the smoothing window size. Seesmooth()
. Default:0
.filter_outliers – whether to filter out outliers in plot. This affects only the plot and not the raw statistics. Default: True.
kwargs – additional options to tensorboard.
- Returns
None
.
- plot_matrix(name: str, value: Union[torch.Tensor, numpy.ndarray, float], labels: Union[List[str], List[List[str]]] = None, show_values: bool = False)[source]¶
plots the given matrix with colorbar and labels if provided.
- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – matrix value to be plotted.
labels – labels of each axis. Can be a list/tuple of strings or a nested list/tuple. Defaults: None.
show_values – show values of the matrix
- Returns
None
.
- property prefix: str¶
returns the prefix of saved folders.
- Returns
_prefix
.
- reset()[source]¶
factory-resets the monitor object. This includes clearing all the collected data, set the iteration and epoch counters to 0, and reset the timer.
- Returns
None
.
- scatter(name: str, value: Union[torch.Tensor, numpy.ndarray], latest_only: bool = False, **kwargs)[source]¶
schedules a scattor plot of (a batch of) points. A 3D
matplotlib
figure will be rendered and saved everyprint_freq
iterations.- Parameters
name – name of the figure to be saved. Must be unique among plots.
value – 2D or 3D tensor to be plotted. The last dim should be 3.
latest_only – whether to save only the latest statistics or keep everything from beginning.
kwargs – additional options to tensorboard.
- Returns
None
.
- neural_monitor.monitor.collect_tracked_variables(name=None, return_name=False)[source]¶
Gets tracked variable given name.
- Parameters
name – name of the tracked variable. can be
str
or``list``/tuple
ofstr``s. If ``None
, all the tracked variables will be returned.return_name – whether to return the names of the tracked variables.
- Returns
the tracked variables.
- neural_monitor.monitor.get_tracked_variables() → Dict[source]¶
Retrieves the values of tracked variables.
- Returns
a dictionary containing the values of tracked variables associated with the given names.
- neural_monitor.monitor.track(name: str, x: Union[torch.Tensor, torch.nn.Module], direction: Optional[str] = None) → Union[torch.Tensor, torch.nn.Module][source]¶
An identity function that registers hooks to track the value and gradient of the specified tensor.
Here is an example of how to track an intermediate output
from neural_monitor import track, get_tracked_variables import nueralnet_pytorch as nnt input = ... conv1 = track('op', nn.Conv2d(dim, 4, 3), 'all') conv2 = nn.Conv2d(4, 5, 3) intermediate = conv1(input) output = track('conv2_output', conv2(intermediate), 'all') loss = T.sum(output ** 2) loss.backward(retain_graph=True) d_inter = T.autograd.grad(loss, intermediate, retain_graph=True) d_out = T.autograd.grad(loss, output) tracked = get_tracked_variables() testing.assert_allclose(tracked['conv2_output'], nnt.utils.to_numpy(output)) testing.assert_allclose(np.stack(tracked['grad_conv2_output']), nnt.utils.to_numpy(d_out[0])) testing.assert_allclose(tracked['op'], nnt.utils.to_numpy(intermediate)) for d_inter_, tracked_d_inter_ in zip(d_inter, tracked['grad_op_output']): testing.assert_allclose(tracked_d_inter_, nnt.utils.to_numpy(d_inter_))
- Parameters
name – name of the tracked tensor.
x – tensor or module to be tracked. If module, the output of the module will be tracked.
direction –
there are 4 options
None
: tracks only value.'forward'
: tracks only value.'backward'
: tracks only gradient.'all'
: tracks both value and gradient.Default:
None
.
- Returns
x.