1. [PyTorch Programming] Understanding Tensors: Concepts, Construction Methods, and Storage

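This post is part of the PyTorch Programming series on my blog. For Python environment setup, see "[Python Learning] Getting Started with Anaconda Installation and Python Environment Management on Windows 10" or "[Python Learning] Getting Started with Anaconda Installation and Python Environment Management from the Terminal Only".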

Author: Chen Yirong (陈艺荣)

Code environment: Python 3.6, PyTorch 1.4.0, Jupyter Notebook

References for this section

Morvan PyTorch | Torch or Numpy

pytorch123 | What is PyTorch?

PyTorch official site | Tensors

Notes | What Is a Tensor & Deep Learning

Checking the versions in the code environment

import torch

print(torch.__file__) # show where PyTorch is installed

print(torch.__version__) # show the version number

print(torch.cuda.is_available()) # check whether CUDA is available

/home/phd-chen.yirong/anaconda3/envs/py36/lib/python3.6/site-packages/torch/__init__.py

1.4.0

True

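Basic concepts of tensors

The Chinese term 张量 corresponds to the English word Tensor. The name of the well-known deep learning framework TensorFlow, for example, is simply Tensor + Flow, that is, a flow of tensors.

To understand tensors, you first need a basic grasp of data structures and algorithms.

Data structures in Python

The classic data structure in Python is the list, and lists can in turn be used to implement other classic data structures such as queues and stacks. Put simply, a data structure is a convention we adopt for storing and organizing data, for example the "first in, first out" rule of a queue or the "first in, last out" rule of a stack.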

Reference: https://docs.python.org/zh-cn/3/tutorial/datastructures.html
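As a small illustration of the two conventions mentioned above, here is a minimal sketch of a stack built on a plain list and a queue built on collections.deque. The use of deque is my own choice for the example (it is part of the standard library and pops efficiently from the left end); the variable names are purely illustrative.

```python
from collections import deque

# Stack: first in, last out. append() pushes, pop() removes the newest item.
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()           # removes 4, the most recently added element

# Queue: first in, first out. popleft() removes the oldest item.
queue = deque([1, 2, 3])
queue.append(4)
first = queue.popleft()     # removes 1, the earliest added element

print(top, stack)           # 4 [1, 2, 3]
print(first, list(queue))   # 1 [2, 3, 4]
```

The data is the same in both cases; only the agreed rule for which element leaves first differs.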

# A list

list1 = [1,2,3,4,5,6,7,8]

print(list1)

print(type(list1))

## Append an element to the end of the list

list1.append(9)

print(list1)

[1, 2, 3, 4, 5, 6, 7, 8]

<class 'list'>

[1, 2, 3, 4, 5, 6, 7, 8, 9]

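The ndarray data structure in NumPy

Ndarray is short for N-dimensional array. An array is itself a data structure: a collection of elements that all have the same data type. So if we require every element of a Python list to have the same data type, that list can be regarded as an array. The ndarray is exactly this kind of specialized data structure. Specifically, an ndarray consists of the following:

A pointer to the data (a block of data in memory or in a memory-mapped file).

The data type, or dtype, which describes the fixed-size cell that each value in the array occupies.

A shape tuple giving the size of each dimension.

A stride tuple, whose integers give the number of bytes to "step over" to reach the next element along the corresponding dimension.

An ndarray object consists of a contiguous one-dimensional segment of computer memory, combined with an indexing scheme that maps each element to a location in that memory block.

Reference: https://www.runoob.com/numpy/numpy-ndarray-object.html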
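The four components listed above can be inspected directly on any array. The following is a minimal sketch; the array a and its int64 dtype are chosen only for illustration (int64 makes each element exactly 8 bytes, so the stride values are easy to read).

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

print(a.dtype)      # int64: every element occupies 8 bytes
print(a.shape)      # (2, 3)
print(a.strides)    # (24, 8): the next row is 24 bytes away, the next column 8 bytes

# Transposing swaps shape and strides but keeps using the same memory block.
a_t = a.T
print(a_t.shape, a_t.strides)      # (3, 2) (8, 24)
print(np.shares_memory(a, a_t))    # True
```

The transpose shows why strides are stored separately from the data: the same one-dimensional block can be read in a different order without copying anything.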

To create an ndarray object, simply call NumPy's array function:

numpy.array(object, dtype = None, copy = True, order = None, subok = False, ndmin = 0)

import numpy as np

list1 = [1,2,3,4,5,6,7,8]

array1 = np.array(list1)

print(array1)

print(type(array1))

[1 2 3 4 5 6 7 8]

<class 'numpy.ndarray'>

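From ndarray to PyTorch's Tensor data structure

Some background first: many researchers treat PyTorch as a replacement for NumPy, that is, they use PyTorch so that their computations can take advantage of the GPU.

A quick primer on two concepts: CPU and GPU

CPU: The central processing unit (CPU) is the computing and control core of a computer system and the final execution unit for information processing and program execution.

GPU: The graphics processing unit (GPU), also known as the display core, visual processor, or display chip, is a microprocessor specialized for image and graphics computations on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).

Some of the differences, roughly speaking: a CPU has a small number of powerful compute units (think of 4, for illustration) that can handle complex operations but are slow at working through a large batch of simple "1+1" problems; a GPU has a very large number of simple compute units (think of 1000, for illustration) that are only suited to simple "1+1" style arithmetic but get through a large batch of such problems quickly. Operations that require massive parallelism therefore tend to benefit from a GPU.

The tensor, as a data structure in deep learning frameworks, emerged against this background of hardware development.

Reference: https://zhuanlan.zhihu.com/p/156171120?utm_source=wechat_session

A simple way to think about it: a Tensor is similar to NumPy's ndarray, except that a Tensor can also run its computations on a GPU. As a result, many of Tensor's properties and methods resemble those of ndarray.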
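To make the resemblance concrete, here is a minimal sketch that runs the same operations on an ndarray and on a Tensor, then moves the Tensor to the GPU when one is present. The variable names are illustrative, and the fallback to "cpu" is an assumption so that the snippet also runs on a machine without CUDA.

```python
import numpy as np
import torch

a = np.array([[1.0, 2.0], [3.0, 4.0]])
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Shape, reduction, and matrix multiplication look almost identical.
print(a.shape, tuple(t.shape))     # (2, 2) (2, 2)
print(a.sum(), t.sum().item())     # 10.0 10.0
print((a @ a).tolist())            # [[7.0, 10.0], [15.0, 22.0]]
print((t @ t).tolist())            # [[7.0, 10.0], [15.0, 22.0]]

# Only the Tensor can be moved to a GPU; the computation itself is unchanged.
device = "cuda" if torch.cuda.is_available() else "cpu"
t_dev = t.to(device)
print((t_dev @ t_dev).cpu().tolist())
```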

Common ways to construct tensors

import torch

import numpy as np

torch.tensor

torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor

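where:

data can be a list, tuple, NumPy ndarray, scalar, or other such type.

dtype specifies the data type of the elements of the new tensor. It can be any torch.dtype; common choices are torch.float, torch.double, and torch.int.

device specifies the device the tensor is allocated on: 'cpu', 'cuda', 'cuda:0', ...

requires_grad specifies whether autograd should record operations on the tensor so that gradients can be computed for it.

pin_memory, if set, allocates the tensor in pinned (page-locked) memory. This parameter applies only to CPU tensors.

The parameters of torch.tensor already reveal some characteristics of tensors: they are array-like, they live on a particular device, and they support gradient computation and store the resulting gradient values.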
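A minimal sketch of two of these parameters in action: how dtype is inferred when it is not given, and what requires_grad=True enables. The function y = sum(x^2) is just an illustrative choice whose gradient, 2x, is easy to check by hand.

```python
import torch

# Without an explicit dtype, it is inferred from the data.
ints = torch.tensor([1, 2, 3])
floats = torch.tensor([1.0, 2.0, 3.0])
print(ints.dtype, floats.dtype)    # torch.int64 torch.float32

# With requires_grad=True, autograd tracks operations on x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()                 # y = x1^2 + x2^2 + x3^2
y.backward()                       # computes dy/dx and stores it in x.grad
print(x.grad)                      # tensor([2., 4., 6.])
```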

tensor1 = torch.tensor([[1, 2], [3, 4]])

print(tensor1)

print(tensor1[1])

print(tensor1[0,0])

print(type(tensor1))

tensor([[1, 2],

[3, 4]])

tensor([3, 4])

tensor(1)

<class 'torch.Tensor'>

Inspecting a tensor's attributes:

print(tensor1.shape)

print(tensor1.dtype)

print(tensor1.device)

torch.Size([2, 2])

torch.int64

cpu

Specifying the tensor's data type and device:

tensor2 = torch.tensor([[1, 2], [3, 4]],dtype=torch.float,device=torch.device('cuda'))

print(tensor2)

print(tensor2[1])

print(tensor2[0,0])

print(type(tensor2))

tensor([[1., 2.],

[3., 4.]], device='cuda:0')

tensor([3., 4.], device='cuda:0')

tensor(1., device='cuda:0')

<class 'torch.Tensor'>

Special tensors:

# 0-dimensional tensor (also called a scalar)

tensor3 = torch.tensor(2022,dtype=torch.int,device=torch.device('cuda'))

print(tensor3)

# Use torch.Tensor.item() to get a Python number from a tensor containing a single value

print(tensor3.item())

tensor(2022, device='cuda:0', dtype=torch.int32)

2022

# Empty tensor

tensor4 = torch.tensor([],dtype=torch.int,device=torch.device('cuda'))

print(tensor4)

tensor([], device='cuda:0', dtype=torch.int32)
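The difference between these two special cases shows up in their dimensions and element counts. Here is a minimal sketch on the CPU (the variable names are illustrative, and the device argument is omitted so that it runs without a GPU):

```python
import torch

scalar = torch.tensor(2022)
empty = torch.tensor([])

# A scalar tensor has zero dimensions but holds exactly one element.
print(scalar.dim(), scalar.shape, scalar.numel())   # 0 torch.Size([]) 1

# An empty tensor has one dimension of length zero and holds no elements.
print(empty.dim(), empty.shape, empty.numel())      # 1 torch.Size([0]) 0

print(scalar.item())                                # 2022
```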

torch.from_numpy

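Creates a tensor from a NumPy.ndarray. The returned tensor and the ndarray share the same memory.

Modifications to the tensor are reflected in the ndarray, and vice versa. The returned tensor cannot be resized.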

np_array = np.array([[1,2,3,4,5,6],[1,2,3,4,5,6]])

x_np = torch.from_numpy(np_array)

print(x_np)

tensor([[1, 2, 3, 4, 5, 6],

[1, 2, 3, 4, 5, 6]])
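The memory sharing described for torch.from_numpy can be checked with a short experiment. This is a minimal sketch; the comparison with torch.tensor, which copies its input, is added here only for contrast, and the variable names are illustrative.

```python
import numpy as np
import torch

np_array = np.array([1, 2, 3])
shared = torch.from_numpy(np_array)   # shares memory with np_array
copied = torch.tensor(np_array)       # copies the data into new memory

np_array[0] = 100                     # write through the ndarray
shared[1] = 200                       # write through the tensor

print(np_array.tolist())   # [100, 200, 3]
print(shared.tolist())     # [100, 200, 3]
print(copied.tolist())     # [1, 2, 3]
```

Both writes are visible from either side of the shared pair, while the copied tensor keeps the original values.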

Creating all-zeros, all-ones, or randomly initialized tensors that match the shape of another tensor

x = torch.tensor([[1,2,3],[3,2,1]],dtype=torch.float,device=torch.device("cuda"))

print(x)

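# Create a tensor of ones with the same shape (and dtype and device) as tensor x

x_ones = torch.ones_like(x)

print(x_ones)

# Create a tensor of zeros with the same shape (and dtype and device) as tensor x

x_zeros = torch.zeros_like(x)

print(x_zeros)

# Create a tensor with randomly initialized elements and the same shape as tensor x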

x_rand = torch.rand_like(x, dtype=torch.float)

print(x_rand)

tensor([[1., 2., 3.],

[3., 2., 1.]], device='cuda:0')

tensor([[1., 1., 1.],

[1., 1., 1.]], device='cuda:0')

tensor([[0., 0., 0.],

[0., 0., 0.]], device='cuda:0')

tensor([[0.9553, 0.0353, 0.3714],

[0.2786, 0.4967, 0.8638]], device='cuda:0')
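The *_like functions take more than the shape from their argument: the dtype and device are inherited as well unless overridden. The sketch below uses an integer source tensor (an illustrative choice, on the CPU) to show the inheritance, and why passing dtype to rand_like can matter, since uniform samples from [0, 1) need a floating-point type.

```python
import torch

x_int = torch.tensor([[1, 2, 3], [3, 2, 1]])   # dtype torch.int64, on the CPU

# ones_like inherits shape, dtype, and device from x_int.
ones = torch.ones_like(x_int)
print(ones.dtype, ones.shape, ones.device)     # torch.int64 torch.Size([2, 3]) cpu

# Overriding dtype keeps the shape but produces floating-point values.
rand = torch.rand_like(x_int, dtype=torch.float)
print(rand.dtype, rand.shape)                  # torch.float32 torch.Size([2, 3])
```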

Creating random or constant tensors with a specified shape

shape = (2,2)

rand_tensor = torch.rand(shape)

print(rand_tensor)

ones_tensor = torch.ones(shape)

print(ones_tensor)

zeros_tensor = torch.zeros(shape)

print(zeros_tensor)

tensor([[0.7727, 0.7495],

[0.1478, 0.7323]])

tensor([[1., 1.],

[1., 1.]])

tensor([[0., 0.],

[0., 0.]])

Basic tensor operations

1. The .to operation on tensors

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor5 = torch.tensor([[1, 2], [3, 4]])

tensor5 = tensor5.to(device)

print(tensor5)

tensor([[1, 2],

[3, 4]], device='cuda:0')

tensor5 = tensor5.to("cpu")

print(tensor5)

tensor([[1, 2],

[3, 4]])

2. Indexing tensor elements

tensor6 = torch.rand(3, 3)

print(tensor6)

print(f"First row: {tensor6[0]}")

print(f"First column: {tensor6[:, 0]}")

print(f"Last column: {tensor6[..., -1]}")

print(f"Last column: {tensor6[:, -1]}")

tensor6[:,1] = 1

print(tensor6)

tensor([[0.4778, 0.4452, 0.2678],

[0.2579, 0.0155, 0.9202],

[0.5348, 0.2705, 0.2810]])

First row: tensor([0.4778, 0.4452, 0.2678])

First column: tensor([0.4778, 0.2579, 0.5348])

Last column: tensor([0.2678, 0.9202, 0.2810])

Last column: tensor([0.2678, 0.9202, 0.2810])

tensor([[0.4778, 1.0000, 0.2678],

[0.2579, 1.0000, 0.9202],

[0.5348, 1.0000, 0.2810]])
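For a 2-D tensor, [..., -1] and [:, -1] select the same column, which is why both prints above show identical values. The two forms only differ once there are more dimensions. Here is a minimal sketch with a 3-D tensor (t3d is an illustrative name):

```python
import torch

t3d = torch.arange(24).reshape(2, 3, 4)

# "..." stands for all leading dimensions, so -1 applies to the last one.
print(t3d[..., -1].shape)   # torch.Size([2, 3])

# ":" covers only the first dimension, so -1 applies to the second one.
print(t3d[:, -1].shape)     # torch.Size([2, 4])

# Indexed assignment modifies the tensor in place, as with tensor6[:,1] = 1.
t3d[:, 1] = 0
print(t3d[0, 1].tolist())   # [0, 0, 0, 0]
```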

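3. Concatenating tensors

This operation is common in feature fusion, for example concatenating the features produced by three models and feeding the result into another model.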

tensor7 = torch.tensor([[1, 2], [3, 4]])

tensor3 = torch.cat([tensor7, tensor7, tensor7], dim=1)

print(tensor3)

tensor8 = torch.cat([tensor7, tensor7, tensor7], dim=0) # concatenate along the outermost dimension

print(tensor8)

tensor9 = torch.cat([tensor7, tensor7, tensor7], dim=-1) # concatenate along the innermost dimension

print(tensor9)

tensor([[1, 2, 1, 2, 1, 2],

[3, 4, 3, 4, 3, 4]])

tensor([[1, 2],

[3, 4],

[1, 2],

[3, 4],

[1, 2],

[3, 4]])

tensor([[1, 2, 1, 2, 1, 2],

[3, 4, 3, 4, 3, 4]])
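The feature-fusion use case can be sketched as follows. The batch size of 4 and the feature widths 8, 16, and 32 are made-up numbers standing in for the outputs of three hypothetical models; the point is only that every dimension except the one being concatenated must match.

```python
import torch

# Hypothetical features from three models for the same batch of 4 samples.
feat_a = torch.ones(4, 8)
feat_b = torch.zeros(4, 16)
feat_c = torch.rand(4, 32)

# Concatenating along dim=1 keeps the batch dimension and adds up the widths.
fused = torch.cat([feat_a, feat_b, feat_c], dim=1)
print(fused.shape)   # torch.Size([4, 56])
```

The fused tensor can then be passed to a downstream model that expects 56 input features per sample.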

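Understanding tensor storage through the view function

Note that both PyTorch and NumPy store an M×N array by flattening it into one dimension in row-major order. For example, the 2-D tensor

t = torch.tensor([[1, 3, 5], [2, 4, 6]])

is actually laid out in memory as

[1, 3, 5, 2, 4, 6]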

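The view() function does not change how the tensor's data is actually laid out in memory; it only reinterprets the same storage with a new shape. It is operations such as transpose and permute, rather than view itself, that produce tensors whose elements are in order semantically but are not contiguous in memory.

In PyTorch, view restructures the dimensions of a tensor, comparable to reshape in NumPy. The returned tensor shares the same data and must have the same number of elements, but may have a different shape. The shape passed to view must be compatible with the original tensor's size and strides.

The view method only works on tensors that satisfy the contiguity condition. It does not allocate new memory; it merely creates a new alias and reference to the original storage, and the return value is a view.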
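That a view is only a new reference to the same storage can be verified directly. This is a minimal sketch; base and v are illustrative names, and data_ptr() is used to compare the addresses of the first elements.

```python
import torch

base = torch.arange(6)        # tensor([0, 1, 2, 3, 4, 5])
v = base.view(2, 3)           # same storage, new shape

# Both tensors start at the same memory address.
print(v.data_ptr() == base.data_ptr())   # True

# Writing through the view changes the original tensor.
v[0, 0] = 100
print(base[0].item())                    # 100

# A view of a contiguous tensor is itself contiguous.
print(v.is_contiguous())                 # True
```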

t0 = torch.randn(4, 4)

print("t0=",t0)

print(t0.size())

t1 = t0.view(16)

print("t1=",t1)

print(t1.size())

t2 = t0.view(-1, 8) # the size -1 is inferred from other dimensions

print("t2=",t2)

print(t2.size())

t0= tensor([[ 0.0713, 0.2641, -1.6546, 1.0520],

[ 2.0892, 0.9782, -0.6922, 0.8448],

[-1.9219, -0.0295, 0.6358, -1.1346],

[ 0.3436, 0.2619, 0.2935, -2.4253]])

torch.Size([4, 4])

t1= tensor([ 0.0713, 0.2641, -1.6546, 1.0520, 2.0892, 0.9782, -0.6922, 0.8448,

-1.9219, -0.0295, 0.6358, -1.1346, 0.3436, 0.2619, 0.2935, -2.4253])

torch.Size([16])

t2= tensor([[ 0.0713, 0.2641, -1.6546, 1.0520, 2.0892, 0.9782, -0.6922, 0.8448],

[-1.9219, -0.0295, 0.6358, -1.1346, 0.3436, 0.2619, 0.2935, -2.4253]])

torch.Size([2, 8])

t3 = t.view(-1) # equivalent to t.view(6)

print("t3=",t3)

print(t3.size())

t3= tensor([1, 3, 5, 2, 4, 6])

torch.Size([6])

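After operations such as transpose or permute, a tensor's elements are in order semantically but are no longer contiguous in memory. In that case, .contiguous() can be used to obtain a tensor that is contiguous both semantically and in memory. If the tensor is already contiguous, .contiguous() returns the tensor itself; otherwise it copies the data into a new block of memory, arranged in the semantic (row-major) order of the reshaped tensor.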

t4 = torch.randn(3, 4)

print("t4=",t4)

t5 = t4.view(4, 3).contiguous()

print("t5=",t5)

print(t5.size())

t4= tensor([[ 1.4501, 0.0161, -0.0799, 0.8645],

[-0.2330, -0.8001, -1.6973, -0.2469],

[ 0.9000, 0.2703, -0.1075, 1.4058]])

t5= tensor([[ 1.4501, 0.0161, -0.0799],

[ 0.8645, -0.2330, -0.8001],

[-1.6973, -0.2469, 0.9000],

[ 0.2703, -0.1075, 1.4058]])

torch.Size([4, 3])

# [[1,3,5],

# [2,4,6]]

t6 = torch.tensor([[1, 3, 5], [2, 4, 6]])

t6_v = t6.view(-1)

print('t6.view(-1)=', t6_v) # [1, 3, 5, 2, 4, 6]

t6.view(-1)= tensor([1, 3, 5, 2, 4, 6])

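The tensor t7 below looks like this semantically:

[[1, 2],

[3, 4],

[5, 6]]

but in memory it is laid out the same as t6:

[1, 3, 5, 2, 4, 6]

To be contiguous in memory, it would have to be stored like this:

[1, 2, 3, 4, 5, 6]

t7 = t6.transpose(0, 1) # in memory: [1, 3, 5, 2, 4, 6]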

print('t7=', t7)

t7= tensor([[1, 2],

[3, 4],

[5, 6]])

# The semantic order and the storage order do not match, so view cannot be used

print(t7.view(-1))

---------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)

1 # The semantic order and the storage order do not match, so view cannot be used

----> 2 print(t7.view(-1))

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

t8 = t7.contiguous() # in memory: [1, 2, 3, 4, 5, 6]

t8_v = t8.view(-1)

print('t8.view(-1)=', t8_v) # [1, 2, 3, 4, 5, 6]

t8.view(-1)= tensor([1, 2, 3, 4, 5, 6])
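The mismatch between semantic order and storage order can be made visible with stride() and is_contiguous(), and the error message's suggestion of .reshape(...) can be tried alongside .contiguous(). This is a minimal sketch; the names src, src_t, flat, and src_c are illustrative.

```python
import torch

src = torch.tensor([[1, 3, 5], [2, 4, 6]])
src_t = src.transpose(0, 1)

# Strides are counted in elements: (3, 1) means "3 to the next row, 1 to the next column".
print(src.stride(), src.is_contiguous())       # (3, 1) True
print(src_t.stride(), src_t.is_contiguous())   # (1, 3) False

# The transpose still points at the original storage.
print(src_t.data_ptr() == src.data_ptr())      # True

# reshape works where view fails; here it has to copy the data.
flat = src_t.reshape(-1)
print(flat.tolist())                           # [1, 2, 3, 4, 5, 6]
print(flat.data_ptr() == src.data_ptr())       # False

# contiguous() also copies, giving row-major strides for the (3, 2) shape.
src_c = src_t.contiguous()
print(src_c.stride(), src_c.is_contiguous())   # (2, 1) True
```

In short, transpose only swaps the strides, while reshape and contiguous pay for a copy when the requested layout cannot be expressed as a view of the existing storage.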