Subpackage: owl.net
owl.net.net module
A package for implementing a Caffe-like network structure using Owl APIs.
The package implements a Caffe-like network structure with some minor differences. It uses Caffe-defined protobuf as its core data structure, so Caffe users can adapt to it easily. The package serves two purposes:
- Quick deployment of neural network training using a configure file.
- Demonstrating the power of the owl package (it takes only a few hundred lines of code to implement Caffe-like functionality and run it on the dataflow engine).
class owl.net.net.ComputeUnit(params)
Bases: object
Interface for each compute unit.
In owl.net, the network is a graph (in fact a DAG) composed of ComputeUnits. ComputeUnit is a wrap-up of Caffe's layer abstraction, but is more general and flexible in its function signature.
Variables:
- params (caffe.LayerParameter) – layer parameter in Caffe's proto structure
- name (str) – name of the unit; the name must be unique
- btm_names (list str) – names of the bottom units
- top_names (list str) – names of the top units
- out_shape (list int) – output size of this unit
Note
params, name, btm_names and top_names will be parsed from Caffe's network description file. out_shape should be set in compute_size().
compute_size(from_btm, to_top)
Calculate the output size of this unit.
This function will be called before training, during the compute_size phase. The compute_size phase is a feed-forward-like phase during which each ComputeUnit calculates the output size (list int) for its top units rather than the output tensor. The size is usually used to calculate the weight and bias sizes for initialization.
Parameters:
- from_btm (dict) – input sizes from the bottom units
- to_top (dict) – output sizes for the top units
forward(from_btm, to_top, phase)
Function for forward propagation.
This function will be called during forward propagation. The function should take its input from from_btm, perform customized computation, and then put the result in to_top. Both from_btm and to_top are dicts whose keys are the names (str) of the bottom/top units and whose values are owl.NArrays serving as the input or output of the function.
Parameters:
- from_btm (dict) – input from bottom units
- to_top (dict) – output for top units
- phase (str) – name of the running phase; currently either "TRAIN" or "TEST"
backward(from_top, to_btm, phase)
Function for backward propagation.
This function will be called during backward propagation. Similar to forward(), the function should take its input from from_top, perform customized computation, and then put the result in to_btm. The function also needs to calculate the gradients (if any) and save them to the weightgrad field (see WeightedComputeUnit.weight_update()).
Parameters:
- from_top (dict) – input from top units
- to_btm (dict) – output for bottom units
- phase (str) – name of the running phase; currently either "TRAIN" or "TEST"
weight_update(base_lr, base_weight_decay, momentum, batch_size)
Function for weight update.
This function will be called during the weight update.
Parameters:
- base_lr (float) – base learning rate
- base_weight_decay (float) – base weight decay
- momentum (float) – momentum value
- batch_size (int) – the size of the current minibatch
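Taken together, these hooks are all a unit needs. Below is a minimal sketch of a pass-through unit written against the documented dict-based interface; IdentityUnit is a hypothetical name, not part of the library:

from owl.net.net import ComputeUnit

class IdentityUnit(ComputeUnit):
    def compute_size(self, from_btm, to_top):
        # output size equals the input size from the (single) bottom unit
        self.out_shape = from_btm[self.btm_names[0]]
        for name in self.top_names:
            to_top[name] = self.out_shape
    def forward(self, from_btm, to_top, phase):
        # pass the bottom activation through unchanged
        to_top[self.top_names[0]] = from_btm[self.btm_names[0]]
    def backward(self, from_top, to_btm, phase):
        # sensitivities also pass through unchanged
        to_btm[self.btm_names[0]] = from_top[self.top_names[0]]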
class owl.net.net.ComputeUnitSimple(params)
Bases: owl.net.net.ComputeUnit
An auxiliary class for ComputeUnit that has only one input unit and one output unit.
compute_size(from_btm, to_top)
Set out_shape to the same shape as the input. Derived classes may override this function.
forward(from_btm, to_top, phase)
Transforms the multiple-input/output interface into the single-input/output function ff().
ff(act, phase)
Function for forward propagation.
Parameters:
- act (owl.NArray) – the activation from the bottom unit
- phase (str) – name of the running phase; currently either "TRAIN" or "TEST"
Returns: the activation of this unit
Return type: owl.NArray
backward(from_top, to_btm, phase)
Transforms the multiple-input/output interface into the single-input/output function bp().
bp(sen)
Function for backward propagation.
Parameters: sen (owl.NArray) – the sensitivity (i.e. the error derivative with respect to the input) from the top unit
Returns: the sensitivity of this unit
Return type: owl.NArray
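As an illustration, a ComputeUnitSimple subclass only has to supply ff() and bp(). The ScaleUnit below is hypothetical, and it assumes owl.NArray supports scalar multiplication:

from owl.net.net import ComputeUnitSimple

class ScaleUnit(ComputeUnitSimple):
    def __init__(self, params, factor=2.0):
        super(ScaleUnit, self).__init__(params)
        self.factor = factor
    def ff(self, act, phase):
        # y = factor * x
        return act * self.factor
    def bp(self, sen):
        # dy/dx = factor, so the sensitivity is scaled by the same amount
        return sen * self.factor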
class owl.net.net.WeightedComputeUnit(params)
Bases: owl.net.net.ComputeUnitSimple
An auxiliary class for ComputeUnit with weights.
Variables:
- weight (owl.NArray) – weight tensor
- weightdelta (owl.NArray) – momentum of weight
- weightgrad (owl.NArray) – gradient of weight
- bias (owl.NArray) – bias tensor
- biasdelta (owl.NArray) – momentum of bias
- biasgrad (owl.NArray) – gradient of bias
- blobs_lr (list float) – learning rates specific to this unit, as a list of floats: [weight_lr, bias_lr]
- weight_decay (list float) – weight decay specific to this unit, as a list of floats: [weight_wd, bias_wd]
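For illustration, the conventional Caffe-style SGD-with-momentum rule that these fields suggest looks roughly as follows; this is a sketch rather than the library's exact code, and it assumes owl.NArray supports scalar arithmetic:

def weight_update(self, base_lr, base_weight_decay, momentum, batch_size):
    lr = base_lr * self.blobs_lr[0]                 # unit-specific learning rate
    wd = base_weight_decay * self.weight_decay[0]   # unit-specific weight decay
    # momentum term plus the minibatch-averaged gradient and weight decay
    self.weightdelta = momentum * self.weightdelta \
        - lr * (self.weightgrad / batch_size + wd * self.weight)
    self.weight += self.weightdelta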
class owl.net.net.LinearUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for linear transformation.
class owl.net.net.SigmoidUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for the sigmoid non-linearity.
class owl.net.net.ReluUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for the ReLU non-linearity.
class owl.net.net.TanhUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for the hyperbolic tangent non-linearity.
class owl.net.net.PoolingUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for pooling.
Note
The input and output are of size [HWCN]:
- H: image height
- W: image width
- C: number of image channels (feature maps)
- N: size of minibatch
class owl.net.net.DropoutUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for dropout.
class owl.net.net.SoftmaxUnit(params)
Bases: owl.net.net.ComputeUnit
Compute unit for softmax.
class owl.net.net.AccuracyUnit(params)
Bases: owl.net.net.ComputeUnit
Compute unit for calculating accuracy.
Note
In terms of Minerva's lazy evaluation, this unit is a non-lazy one, since it pulls the actual contents (the accuracy) out of an owl.NArray.
class owl.net.net.LRNUnit(params)
Bases: owl.net.net.ComputeUnitSimple
Compute unit for LRN (local response normalization).
class owl.net.net.ConcatUnit(params)
Bases: owl.net.net.ComputeUnit
Compute unit for concatenation.
Concatenates the input arrays along the dimension specified by Caffe's concat_dim_caffe.
class owl.net.net.FullyConnection(params)
Bases: owl.net.net.WeightedComputeUnit
Compute unit for the traditional fully connected layer.
class owl.net.net.ConvConnection(params)
Bases: owl.net.net.WeightedComputeUnit
Convolution operation.
Note
The input and output are of size [HWCN]:
- H: image height
- W: image width
- C: number of image channels (feature maps)
- N: size of minibatch
compute_size(from_btm, to_top)
Compute the output size, as well as the weight and bias sizes.
Note
The weight (kernel) size is [HWCiCo]; the bias shape is [Co]:
- H: kernel height
- W: kernel width
- Ci: number of input channels
- Co: number of output channels
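The arithmetic itself is the standard one. A sketch, with a hypothetical helper name and assuming square kernels and symmetric padding:

def conv_out_size(in_h, in_w, kernel, stride, pad):
    # standard convolution output-size formula
    out_h = (in_h + 2 * pad - kernel) // stride + 1
    out_w = (in_w + 2 * pad - kernel) // stride + 1
    return out_h, out_w

# e.g. a 227x227 input with an 11x11 kernel, stride 4, no padding -> 55x55
assert conv_out_size(227, 227, 11, 4, 0) == (55, 55)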
class owl.net.net.DataUnit(params, num_gpu)
Bases: owl.net.net.ComputeUnit
The base class for data units.
Variables:
- dp – data provider; different kinds of data providers load data from different formats
- generator – the iterator produced by the data provider
class owl.net.net.LMDBDataUnit(params, num_gpu)
Bases: owl.net.net.DataUnit
DataUnit that loads from LMDB.
Variables: params (caffe.LayerParameter) – LMDB data layer parameters defined by Caffe; params.data_param contains information about the data source, and params.transform_param mainly defines data augmentation operations
class owl.net.net.ImageDataUnit(params, num_gpu)
Bases: owl.net.net.DataUnit
DataUnit that loads from raw images.
Variables: params (caffe.LayerParameter) – image data layer parameters defined by Caffe; this is often used when data is limited. Loading from the original images will be slower than loading from LMDB.
class owl.net.net.ImageWindowDataUnit(params, num_gpu)
Bases: owl.net.net.DataUnit
DataUnit that loads from image window patches.
Variables: params (caffe.LayerParameter) – image window data layer parameters defined by Caffe; this is often used when data is limited and object bounding boxes are given
class owl.net.net.Net
The class for neural network structure.
The Net is basically a graph (a DAG) in which each node is a ComputeUnit.
Variables:
- units (list owl.net.ComputeUnit) – all the ComputeUnits
- adjacent (list list str) – the adjacency list (units are represented by their names)
- reverse_adjacent (list list str) – the reverse adjacency list (units are represented by their names)
- name_to_uid (dict) – a map from unit name to unit object
- loss_uids (list int) – all the units for computing loss
- accuracy_uids (list int) – all the units for calculating accuracy
add_unit(unit)
Method for adding a unit to the graph.
Parameters: unit (owl.net.ComputeUnit) – the unit to add
connect(u1, u2)
Method for connecting two units.
Parameters:
- u1 (str) – name of the bottom unit
- u2 (str) – name of the top unit
get_units_by_name(name)
Get a ComputeUnit object by its name.
Parameters: name (str) – unit name
Returns: the compute unit object with that name
Return type: owl.net.ComputeUnit
get_loss_units()
Get all ComputeUnit objects for loss.
Returns: all compute unit objects for computing loss
Return type: list owl.net.ComputeUnit
get_accuracy_units()
Get all ComputeUnit objects for accuracy.
Returns: all compute unit objects for computing accuracy
Return type: list owl.net.ComputeUnit
get_data_unit(phase='TRAIN')
Get the ComputeUnit object for data loading.
Parameters: phase (str) – phase name of the run
Returns: the compute unit object for loading data
Return type: owl.net.ComputeUnit
get_weighted_unit_ids()
Get the ids of all owl.net.WeightedComputeUnits.
Returns: ids of all weighted compute units
Return type: list int
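A minimal sketch of assembling a Net by hand with the methods above; data_unit and fc_unit stand for already-constructed ComputeUnits whose (hypothetical) names are 'data' and 'fc1':

from owl.net.net import Net

net = Net()
net.add_unit(data_unit)
net.add_unit(fc_unit)
net.connect('data', 'fc1')         # 'data' becomes the bottom of 'fc1'
fc = net.get_units_by_name('fc1')  # look a unit up by its unique name
loss_units = net.get_loss_units()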
owl.net.trainer module
class owl.net.trainer.NetTrainer(solver_file, snapshot=0, num_gpu=1)
Class for training a neural network.
Allows the user to train with Caffe's network configure format, but on multiple GPUs. One could use NetTrainer as follows:
>>> trainer = NetTrainer(solver_file, snapshot, num_gpu)
>>> trainer.build_net()
>>> trainer.run()
Variables:
- solver_file (str) – path of the solver file in Caffe's proto format
- snapshot (int) – the index of the snapshot to start with
- num_gpu (int) – the number of GPUs to use
build_net()
Build the network structure using Caffe's proto definition. It will also initialize the network, either from a given snapshot or from scratch (using a proper initializer). During initialization, it will first try to load the weights from the snapshot; if that fails, it will initialize the weights accordingly.
run(s)
Run the training algorithm on multiple GPUs.
The basic logic is similar to traditional single-GPU training code, as follows (pseudocode):

for epoch in range(MAX_EPOCH):
    for i in range(NUM_MINI_BATCHES):
        # load the i-th minibatch
        minibatch = loader.load(i, MINI_BATCH_SIZE)
        net.ff(minibatch.data)
        net.bp(minibatch.label)
        grad = net.gradient()
        net.update(grad, MINI_BATCH_SIZE)
With Minerva's lazy evaluation and dataflow engine, we are able to modify the above logic to perform data parallelism on multiple GPUs (pseudocode):

for epoch in range(MAX_EPOCH):
    for i in range(0, NUM_MINI_BATCHES, NUM_GPU):
        gpu_grad = [None for _ in range(NUM_GPU)]
        for gpuid in range(NUM_GPU):
            # specify which gpu the following code runs on
            owl.set_device(gpuid)
            # each minibatch is split among the GPUs
            minibatch = loader.load(i + gpuid, MINI_BATCH_SIZE / NUM_GPU)
            net.ff(minibatch.data)
            net.bp(minibatch.label)
            gpu_grad[gpuid] = net.gradient()
        net.accumulate_and_update(gpu_grad, MINI_BATCH_SIZE)
So each GPU takes charge of training one mini-mini-batch, and since their ff, bp and gradient calculations are independent of each other, they can be parallelized naturally by Minerva's DAG engine. The only problem left is the accumulate_and_update of the gradients from all GPUs. If we did it on one GPU, that GPU would become a bottleneck. The solution is to also partition this workload across the GPUs (pseudocode):

def accumulate_and_update(gpu_grad, MINI_BATCH_SIZE):
    num_layers = len(gpu_grad[0])
    for layer in range(num_layers):
        upd_gpu = layer * NUM_GPU / num_layers  # specify which gpu updates this layer
        owl.set_device(upd_gpu)
        for gid in range(NUM_GPU):
            if gid != upd_gpu:
                gpu_grad[upd_gpu][layer] += gpu_grad[gid][layer]
        net.update_layer(layer, gpu_grad[upd_gpu][layer], MINI_BATCH_SIZE)
Since the updates of different layers are independent of each other, they can be parallelized efficiently as well. Minerva's dataflow engine transparently handles dependency resolution, scheduling, and memory copying among different devices, so users don't need to worry about any of that.
class owl.net.trainer.MultiviewTester(solver_file, softmax_layer_name, snapshot, gpu_idx=0)
Class for performing multi-view testing.
Run it as:
>>> tester = MultiviewTester(solver_file, softmax_layer, snapshot, gpu_idx)
>>> tester.build_net()
>>> tester.run()
Variables:
- solver_file (str) – path of the solver file in Caffe's proto format
- snapshot (int) – the snapshot for testing
- softmax_layer_name (str) – name of the softmax layer that produces the prediction
- gpu_idx (int) – which GPU to perform the test on
class owl.net.trainer.FeatureExtractor(solver_file, snapshot, gpu_idx=0)
Class for extracting trained features.
Features will be stored in a txt file as a matrix. The size of the feature matrix is [num_img, feature_dimension].
Run it as:
>>> extractor = FeatureExtractor(solver_file, snapshot, gpu_idx)
>>> extractor.build_net()
>>> extractor.run(layer_name, feature_path)
Variables:
- solver_file (str) – path of the solver file in Caffe's proto format
- snapshot (int) – the snapshot for testing
- layer_name (str) – name of the layer that produces the feature
- gpu_idx (int) – which GPU to perform the test on
owl.net.net_helper module
class owl.net.net_helper.CaffeNetBuilder(solver_file)
Class to build a network from Caffe's solver and configure files.
Variables: solver_file (str) – Caffe's solver file
change_net(net_file)
Manually assign a network configure file instead of using the one provided in the solver.
Variables: net_file (str) – Caffe's network configure file
build_net(owl_net, num_gpu=1)
Parse the information from the solver and network configure files, and build the network and processing plan.
Variables: num_gpu (int) – the number of GPUs to train on in parallel; it must be provided here because it tells the data layer how to slice a training batch
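A typical flow, as a sketch that uses only the documented signatures (the file paths are placeholders):

from owl.net.net import Net
from owl.net.net_helper import CaffeNetBuilder

owl_net = Net()
builder = CaffeNetBuilder('path/to/solver.prototxt')
builder.change_net('path/to/net.prototxt')  # optional: override the net file from the solver
builder.build_net(owl_net, num_gpu=2)       # parse the protos and build the unit DAG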
class owl.net.net_helper.CaffeModelLoader(model_file, weightdir, snapshot)
Class to convert Caffe's caffemodel into numpy array files. Minerva uses numpy array files to store and save model snapshots.
Variables:
- model_file (str) – Caffe's caffemodel
- weightdir (str) – directory in which to save the numpy-array models
- snapshot (int) – snapshot index
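For instance (a sketch; the paths and snapshot index below are placeholders):

>>> loader = CaffeModelLoader('./alexnet.caffemodel', './snapshots/', 0)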
owl.net.netio module
class owl.net.netio.ImageWindowDataProvider(window_data_param, mm_batch_num)
Class for the image window data provider. This data provider reads the original images and crops out patches according to the given box positions, then resizes the patches to form a batch.
Note
Layer type in Caffe's configure file: WINDOW_DATA
Data format for each image:

# window meta data
[Img_ind][Img_path][C][H][W][Window_num]
# windows
[label][overlap_ratio][upper][left][lower][right]
[label][overlap_ratio][upper][left][lower][right]
......
[label][overlap_ratio][upper][left][lower][right]

- Img_ind: image index
- Img_path: image path
- C: number of image channels (feature maps)
- H: image height
- W: image width
- Window_num: number of window patches
- label: label
- overlap_ratio: overlap ratio between the window and the object bounding box
- upper left lower right: position of the window
class owl.net.netio.ImageListDataProvider(image_data_param, transform_param, mm_batch_num)
Class for the image data provider. This data provider reads the original data into RGB values, then resizes the patches to form a batch.
Note
Layer type in Caffe's configure file: IMAGE_DATA
Data format for each image:

[Img_path][label_0][label_1]...[label_n]

- Img_path: image path
- label_0 label_1 ... label_n: multiple labels are supported for a single image
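For example, a listing file with three labels per image might look as follows (the paths and label values are made up, and the fields are assumed to be whitespace-separated):

images/img_0001.jpg 0 3 7
images/img_0002.jpg 1 4 9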
class owl.net.netio.LMDBDataProvider(data_param, transform_param, mm_batch_num)
Class for the LMDB data provider.
Note
Layer type in Caffe's configure file: DATA
get_multiview_mb()
Multi-view testing achieves better accuracy than single-view testing. For each image, this method crops out the left-top, right-top, left-down, right-down and central patches, together with their horizontally flipped versions. The final prediction is averaged over these 10 views. Thus, for each original batch, get_multiview_mb produces 10 consecutive batches.
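The numpy sketch below (the function name is hypothetical, not part of owl.net) illustrates the 10-view scheme described above:

import numpy as np

def ten_views(img, crop):
    # img: H x W x C array; crop: side length of the square view
    h, w = img.shape[:2]
    offsets = [(0, 0), (0, w - crop),                # left-top, right-top
               (h - crop, 0), (h - crop, w - crop),  # left-down, right-down
               ((h - crop) // 2, (w - crop) // 2)]   # central
    views = [img[y:y + crop, x:x + crop] for (y, x) in offsets]
    views += [v[:, ::-1] for v in views]  # horizontal flips
    return views  # 10 views; predictions are averaged over them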