Basic Configuration
import torch

print(torch.__version__)
torch.set_num_threads(1)
print(torch.__config__.parallel_info())
Model Operations
- Model files
- They go by several names (bin/pt/pth); under the hood they are all pickle data stored in a zip archive.
- Model export
- Model loading (a minimal save/load sketch follows this list)
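A minimal export/load sketch, assuming a plain nn.Module; saving the state_dict is the recommended form, and the file name model.pth is arbitrary:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Export: save only the parameters (recommended)
torch.save(model.state_dict(), 'model.pth')
# torch.save(model, 'model.pth') would pickle the whole module instead

# Load: construct the module first, then restore its parameters
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load('model.pth'))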
Tensor Data Types
https://pytorch.org/docs/stable/tensors.html — depending on the dtype and device arguments, torch.Tensor can represent dozens of concrete tensor types.
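A minimal sketch of picking dtype and device:

import torch

x = torch.zeros(2, 3, dtype=torch.float16)   # half precision on CPU
y = torch.ones(2, 3, dtype=torch.int64)      # 64-bit integers
if torch.cuda.is_available():
    z = torch.zeros(2, 3, device='cuda')     # float32 on GPU
print(x.dtype, y.dtype)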
nn.Linear
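nn.Linear applies an affine transformation y = xA^T + b; a minimal sketch:

import torch
import torch.nn as nn

linear = nn.Linear(in_features=8, out_features=4)
x = torch.randn(2, 8)    # batch of 2 vectors
y = linear(x)            # shape (2, 4)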
nn.Embedding
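nn.Embedding is a lookup table mapping integer ids to dense vectors; a minimal sketch:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=3)
ids = torch.tensor([1, 5, 5, 9])
vecs = emb(ids)          # shape (4, 3)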
Tensor Operations
Creation: torch.empty, torch.tensor, torch.rand, torch.zeros
Resize: Tensor.view
reshape
CUDA Tensor: tensor.to() moves a tensor to the specified device
register_buffer: registers a tensor as a module buffer; it is saved in the state_dict and moved by .to(), but is not a trainable parameter (see the sketch below)
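A minimal register_buffer sketch:

import torch
import torch.nn as nn

class Scaler(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('scale', torch.tensor(2.0))

    def forward(self, x):
        return x * self.scale

m = Scaler()
print(m.state_dict())    # contains 'scale', but m.parameters() does not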
unsqueeze: inserts a dimension of size 1 at the specified position
einsum: Einstein summation notation
bmm: batched matrix multiplication; both inputs must be 3-D tensors
masked_fill(mask, value): sets the elements of the tensor where mask is True to value
tril: returns the lower-triangular part of a matrix (the main diagonal and below); all other elements are zeroed (see the sketch below)
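These ops often appear together in attention-style code; a minimal sketch combining bmm, tril, and masked_fill for causal masking:

import torch

q = torch.randn(2, 4, 8)                    # (batch, seq, dim)
k = torch.randn(2, 4, 8)
scores = torch.bmm(q, k.transpose(1, 2))    # (2, 4, 4)

mask = torch.tril(torch.ones(4, 4)) == 0    # True above the diagonal
scores = scores.masked_fill(mask, float('-inf'))  # hide future positions
attn = scores.softmax(dim=-1)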
expand: broadcasts a size-1 dimension to a larger size without copying data
view: returns a reshaped view; requires contiguous memory
permute: reorders the dimensions of a tensor
contiguous: returns a tensor with a standard contiguous memory layout (see the sketch below)
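A minimal sketch of these shape ops; note that view requires contiguous memory, which permute breaks:

import torch

x = torch.arange(6).reshape(2, 3)
y = x.permute(1, 0)                  # (3, 2) view, non-contiguous
# y.view(6) would raise a RuntimeError; restore contiguity first
z = y.contiguous().view(6)

e = torch.ones(3, 1).expand(3, 4)    # broadcast without copying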
Custom PyTorch Op Implementation
import torch
from torch.autograd import Function
from torch.utils.cpp_extension import load

# Load the C++ extension module containing the custom op; custom_op.cpp must
# define custom_forward/custom_backward and expose them via pybind11
custom_op = load(name='custom_op', sources=['custom_op.cpp'])

# Forward and backward passes of the custom op
class CustomOpFunction(Function):
    @staticmethod
    def forward(ctx, input):
        output = custom_op.custom_forward(input)  # call the extension's forward function
        ctx.save_for_backward(input)              # save the input for the backward pass
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors                # retrieve the saved input
        grad_input = custom_op.custom_backward(grad_output, input)  # call the extension's backward function
        return grad_input

# Use the custom op
input = torch.randn(3, 3, requires_grad=True)
custom_op_function = CustomOpFunction.apply
output = custom_op_function(input)
loss = output.sum()
loss.backward()

print("Input:", input)
print("Output:", output)
print("Gradient:", input.grad)
Autograd
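A minimal autograd sketch: tensors created with requires_grad=True record the operations applied to them, and backward() populates .grad:

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()    # y = sum(x_i^2)
y.backward()          # dy/dx_i = 2 * x_i
print(x.grad)         # tensor([2., 4., 6.])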
TORCH.FX
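A minimal torch.fx sketch: symbolic_trace captures a module as an editable graph IR:

import torch
import torch.fx as fx

class M(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1

traced = fx.symbolic_trace(M())
print(traced.graph)    # the captured IR
print(traced.code)     # the regenerated Python forward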
Model Quantization
- Taxonomy 1 (by quantization approach)
- Dynamic Quantization
- Post-Training Static Quantization (PTQ)
- Quantization Aware Training (QAT)
- Taxonomy 2 (by API mode)
- Eager Mode Quantization
- FX Graph Mode Quantization
Dynamic Quantization
Weights are quantized ahead of time, while activations are read/stored in floating point and quantized on the fly for compute, based on the data ranges observed during inference.
import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 8))  # any model containing Linear layers
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized_model)
Post-Training Static Quantization (PTQ)
# set quantization config for server (x86) deployment
myModel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
# insert observers (the model must be in eval mode for PTQ)
myModel.eval()
torch.quantization.prepare(myModel, inplace=True)
# calibrate: run representative data through myModel here to collect statistics
# convert to the quantized version
torch.quantization.convert(myModel, inplace=True)
Quantization Aware Training (QAT)
# specify quantization config for QAT
qat_model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
# prepare for QAT (the model must be in training mode)
torch.quantization.prepare_qat(qat_model.train(), inplace=True)
# after training, convert to the quantized version; eval() removes dropout,
# so accuracy can be checked on each epoch
quantized_model = torch.quantization.convert(qat_model.eval(), inplace=False)