tensorflow/core/profiler/g3doc/profile_model_architecture.md
<b>Notes:</b>
The VariableV2 operation type might include variables that TensorFlow creates
implicitly. Users normally don't want to count them as "model capacity".
A customized operation type can be used to select a subset of variables.
For example, _trainable_variables is created automatically by the tfprof Python
API. Users can also define their own customized operation types.
# Parameters are created by operation type 'VariableV2' (for older models,
# it's 'Variable'). The scope view is usually suitable in this case.
tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params
_TFProfRoot (--/930.58k params)
global_step (1/1 params)
init/init_conv/DW (3x3x3x16, 432/864 params)
pool_logit/DW (64x10, 640/1.28k params)
pool_logit/DW/Momentum (64x10, 640/640 params)
pool_logit/biases (10, 10/20 params)
pool_logit/biases/Momentum (10, 10/10 params)
unit_last/final_bn/beta (64, 64/128 params)
unit_last/final_bn/gamma (64, 64/128 params)
unit_last/final_bn/moving_mean (64, 64/64 params)
unit_last/final_bn/moving_variance (64, 64/64 params)
# The Python API profiles tf.trainable_variables() instead of VariableV2.
#
# By default, the result is printed to stdout. Users can update
# options['output'] to write to a file instead. The result is always
# returned as a protocol buffer.
import sys
import tensorflow as tf
param_stats = tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder
        .trainable_variables_parameter())
sys.stdout.write('total_params: %d\n' % param_stats.total_parameters)
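
Each row of the scope view above shows two numbers: the node's own parameter count and the total that includes everything nested under its name. The following is a minimal sketch in plain Python, not tfprof's implementation. It assumes a variable's parameter count is the product of its shape dimensions and that nested nodes are identified by name prefix. The helpers `own_params` and `total_params` are hypothetical names introduced for illustration.

```python
from math import prod

# Shapes copied from the pool_logit rows of the scope view listing above.
shapes = {
    "pool_logit/DW": (64, 10),
    "pool_logit/DW/Momentum": (64, 10),
    "pool_logit/biases": (10,),
    "pool_logit/biases/Momentum": (10,),
}

def own_params(name):
    """Parameters held by the variable itself: product of its dimensions."""
    return prod(shapes[name])

def total_params(name):
    """Own parameters plus those of every variable nested under `name`."""
    return sum(prod(shape) for other, shape in shapes.items()
               if other == name or other.startswith(name + "/"))

for name in shapes:
    print("%s (%d/%d params)" % (name, own_params(name), total_params(name)))
```

Under these assumptions, `pool_logit/DW` comes out as 640 own parameters and 1280 in total, which corresponds to the `640/1.28k` row in the listing: the extra 640 belong to its `Momentum` slot.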
For an operation to have float operation statistics:

* It must have RegisterStatistics('flops') defined in TensorFlow. tfprof
  uses the definition to calculate float operations. Contributions are
  welcome.
* It must have known "shape" information so that RegisterStatistics('flops')
  can calculate the statistics. If the shape is only known at run time, it is
  suggested to pass in -run_meta_path; tfprof can then fill in the missing
  shape with the runtime shape information from RunMetadata. It is therefore
  also suggested to use the -account_displayed_op_only option, so that you
  know the statistics cover only the operations printed out.

If no RunMetadata is provided, tfprof counts the float_ops of each graph node
once, even if the node is defined in a tf.while_loop, because tfprof cannot
know statically how many times each graph node runs. If RunMetadata is
provided, tfprof calculates float_ops as float_ops * run_count.
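
To illustrate why shapes are required and how the run count enters, here is a small sketch in plain Python. It is not the registered statistics code in TensorFlow. It assumes 2 float operations per multiply-add, a batch size of 128, and 32x32 feature maps for the first convolution. None of these values is stated in this document; they are chosen because they reproduce figures in the listings of this section. All function names are hypothetical.

```python
def matmul_flops(m, k, n):
    """Float ops for an (m x k) @ (k x n) product, 2 per multiply-add."""
    return 2 * m * k * n

def conv2d_flops(batch, out_h, out_w, k_h, k_w, in_depth, out_depth):
    """Float ops for a convolution: 2 per multiply-add, per output element."""
    output_elements = batch * out_h * out_w * out_depth
    return 2 * output_elements * k_h * k_w * in_depth

def accounted_flops(static_flops, run_count=None):
    """Without RunMetadata (run_count is None) a node is counted once."""
    return static_flops if run_count is None else static_flops * run_count

BATCH = 128  # assumed batch size, not stated in the listings

# 64x10 weight matrix, as in pool_logit/DW.
print(matmul_flops(BATCH, 64, 10))
# 3x3x3x16 kernel, as in init/init_conv/DW, on an assumed 32x32 output.
print(conv2d_flops(BATCH, 32, 32, 3, 3, 3, 16))
# The same MatMul node if RunMetadata reported 5 executions (made-up count).
print(accounted_flops(matmul_flops(BATCH, 64, 10), run_count=5))
```

With these assumptions the first two values are 163840 and 113246208, consistent with the `163.84k` and `113.25m` entries shown for `pool_logit/xw_plus_b/MatMul` and `init/init_conv/Conv2D`. Without a known shape, none of the factors in these products would be available, which is why the missing shape has to come from RunMetadata.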
# To profile float operations from the command line, you need to pass
# --graph_path and --op_log_path.
tfprof> scope -min_float_ops 1 -select float_ops -account_displayed_op_only
node name | # float_ops
_TFProfRoot (--/17.63b flops)
gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul (163.84k/163.84k flops)
gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1 (163.84k/163.84k flops)
init/init_conv/Conv2D (113.25m/113.25m flops)
pool_logit/xw_plus_b (1.28k/165.12k flops)
pool_logit/xw_plus_b/MatMul (163.84k/163.84k flops)
unit_1_0/sub1/conv1/Conv2D (603.98m/603.98m flops)
unit_1_0/sub2/conv2/Conv2D (603.98m/603.98m flops)
unit_1_1/sub1/conv1/Conv2D (603.98m/603.98m flops)
unit_1_1/sub2/conv2/Conv2D (603.98m/603.98m flops)
# Some might prefer the op view, which aggregates by operation type.
tfprof> op -min_float_ops 1 -select float_ops -account_displayed_op_only -order_by float_ops
node name | # float_ops
Conv2D 17.63b float_ops (100.00%, 100.00%)
MatMul 491.52k float_ops (0.00%, 0.00%)
BiasAdd 1.28k float_ops (0.00%, 0.00%)
# You can also do that with the Python API.
tf.profiler.profile(
    tf.get_default_graph(),
    options=tf.profiler.ProfileOptionBuilder.float_operation())
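
The relationship between the scope view and the op view can be sketched as a simple group-by. The snippet below is plain Python, not tfprof code. It takes a few nodes from the scope view listing above and sums their own float_ops per operation type. The operation type of each node is an assumption inferred from its name (in particular, `pool_logit/xw_plus_b` is assumed to be a BiasAdd), and `aggregate_by_op_type` is a hypothetical helper.

```python
from collections import defaultdict

# (node name, assumed op type, own float_ops) from the scope view listing.
nodes = [
    ("gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul", "MatMul", 163840),
    ("gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1", "MatMul", 163840),
    ("pool_logit/xw_plus_b/MatMul", "MatMul", 163840),
    ("pool_logit/xw_plus_b", "BiasAdd", 1280),
]

def aggregate_by_op_type(rows):
    """Sum own float_ops per op type, largest first (like -order_by float_ops)."""
    totals = defaultdict(int)
    for _name, op_type, float_ops in rows:
        totals[op_type] += float_ops
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

for op_type, float_ops in aggregate_by_op_type(nodes):
    print("%s %d float_ops" % (op_type, float_ops))
```

The three MatMul nodes sum to 491520 and the single BiasAdd contributes 1280, matching the `491.52k` and `1.28k` rows of the op view.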