Training a CNN on the MNIST Dataset with TensorFlow 2 (Part 2)
Next we quantize the model with TFLite. To be honest, I am not sure whether the intermediate activations (the data flow) actually get quantized along the way.
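For context, TFLite's 8-bit scheme maps floats to integers affinely, q = round(x/scale) + zero_point. A minimal numpy sketch (the scale and zero_point values here are illustrative, not taken from any converted model):

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    """Affine 8-bit quantization: q = round(x/scale) + zero_point, clamped to int8."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Approximate inverse mapping back to float: x ~ (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * scale

# Pixel values normalized to [0, 1); scale chosen so int8 covers that range
scale, zero_point = 1.0 / 255.0, -128
x = np.array([0.0, 0.25, 0.6, 0.99], dtype=np.float32)
q = quantize_int8(x, scale, zero_point)
x_hat = dequantize_int8(q, scale, zero_point)
# Round-trip error stays within one quantization step
assert np.max(np.abs(x - x_hat)) <= scale
```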
1. Data Preparation
# 1. Data preparation
import tensorflow as tf
import numpy as np
mnist = tf.keras.datasets.mnist
img_rows,img_cols = 28,28
(x_train_, y_train_), (x_test_, y_test_) = mnist.load_data()
x_train = x_train_.reshape(x_train_.shape[0],img_rows,img_cols,1)
x_test = x_test_.reshape(x_test_.shape[0],img_rows,img_cols,1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
# Normalize pixel values from [0, 255] into [0, 1)
x_train = x_train / 256
x_test = x_test / 256
y_train_onehot = tf.keras.utils.to_categorical(y_train_)
y_test_onehot = tf.keras.utils.to_categorical(y_test_)
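to_categorical turns each integer label into a one-hot row; the same thing can be sketched with plain numpy indexing (a toy example, not the actual training labels):

```python
import numpy as np

# Toy labels standing in for y_train_; each becomes a length-10 one-hot row
labels = np.array([3, 0, 7])
onehot = np.eye(10, dtype=np.float32)[labels]
assert onehot.shape == (3, 10)
assert (onehot.argmax(1) == labels).all()
```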
2. Loading the Baseline Model
model = tf.keras.models.load_model('models/mnist_tf2_fw.h5')
score = model.evaluate(x_test, y_test_onehot, verbose=0)
print('Test accuracy:', "{:.5f}".format(score[1]))
3. Loading the Quantized Model
interpreter = tf.lite.Interpreter(model_path="models/tflite_tf2_8.tflite")
#interpreter = tf.lite.Interpreter(model_path="models/tflite_tf2_dy.tflite")
#interpreter = tf.lite.Interpreter(model_path="models/tflite_tf2_16.tflite")
#interpreter = tf.lite.Interpreter(model_path="models/tflite_tf2_32.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
#print(input_details)
#print(output_details)
y_label = y_test_onehot.argmax(1)
# Quantized-model inference
pre = []
# TFLite targets mobile (e.g. Android) deployment, so the interpreter
# processes one sample per invoke()
for i in range(len(x_test)):
    # The .astype(np.float32) cast is essential -- this bug took a long time to track down
    x_test1 = x_test[i].reshape(1, 28, 28, 1).astype(np.float32)
    interpreter.set_tensor(input_details[0]['index'], x_test1)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    pre.append(output_data.argmax())
temp = (pre == y_label)
acc = sum(temp) / len(y_label)
print(acc)
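On the question of whether the data flow is quantized: with post-training dynamic-range quantization (presumably the recipe behind tflite_tf2_dy.tflite, though that is an assumption), weights are stored as int8 but the interpreter's input and output tensors stay float32, which can be verified from input_details. A self-contained sketch using a tiny throwaway model in place of the trained one:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the real mnist_tf2_fw.h5 would be loaded instead
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Post-training dynamic-range quantization: int8 weights, float activations
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()

# The input tensor is still float32, so the data flow is not int8 end to end
assert input_details[0]['dtype'] == np.float32
```

Full integer quantization (int8 inputs and outputs as well) would additionally require a representative_dataset and explicit inference_input_type / inference_output_type settings on the converter.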
| 32-bit | 16-bit | 8-bit | default |
|---|---|---|---|
| 98.500 | 98.500 | 98.440 | 98.440 |
These results are close to those from weight-only quantization. Quantizing the data flow is genuinely harder: it requires lower-level, finer-grained control. For RRAM, training will likely mix 4-bit and 8-bit precision, so training with low-precision data flows is quite important.
Future plan: try low-bit, data-flow quantization, study others' work, and lay the theoretical groundwork for high-performance computing on RRAM. GitHub code