keras-3 reuters
from keras.datasets import reuters

(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)
Load the Reuters dataset, keeping only the 10,000 most frequently occurring words, and split it into training and test data.
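Because of num_words=10000, no word index in the loaded data should exceed 9999. A quick sanity check (a small sketch, not part of the original snippet):

# Each sample is a list of word indices; with num_words=10000
# the largest index anywhere in the data should be 9999.
print(max(max(sequence) for sequence in train_data))  # 9999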
print(train_data.shape)
print(test_data.shape)

(8982,)
(2246,)
Check the shape of the data: 8,982 training samples and 2,246 test samples.
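Each sample is a list of word indices, not readable text. If you want to see the actual newswire, the indices can be mapped back to words with reuters.get_word_index(); an optional sketch (not in the original post):

word_index = reuters.get_word_index()
reverse_word_index = {value: key for key, value in word_index.items()}
# Indices 0, 1, 2 are reserved for "padding", "start of sequence", and
# "unknown", so the stored indices are offset by 3.
decoded = ' '.join(reverse_word_index.get(i - 3, '?') for i in train_data[0])
print(decoded)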
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1
    return results
Define a function that vectorizes the input: each sequence of word indices becomes a 10,000-dimensional multi-hot vector with a 1 at every index that appears in the sequence.
x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
Vectorize the training and test data.
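To see what this produces, here is a toy example with a reduced dimension of 5 (illustrative only):

toy = vectorize_sequences([[0, 2], [1, 4, 4]], dimension=5)
print(toy)
# [[1. 0. 1. 0. 0.]
#  [0. 1. 0. 0. 1.]]
# Note that repeated indices (4 above) still produce a single 1.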
from keras.utils import to_categorical  # keras.utils.np_utils is deprecated in recent Keras

one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)
Convert the integer labels to one-hot (categorical) vectors: each label becomes a 46-dimensional vector with a 1 at the position of its topic index.
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))
Build the model. Since this is multiclass classification over 46 topics, the output layer uses softmax and produces a 46-dimensional probability distribution.
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Specify an appropriate loss function, optimizer, and metric. Categorical crossentropy matches the one-hot labels and the softmax output.
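If you prefer to skip the one-hot conversion of the labels, Keras also provides sparse_categorical_crossentropy, which works directly on integer labels; a minimal sketch of the alternative setup:

# Alternative: keep train_labels as integers and use the sparse loss.
# The rest of the workflow is unchanged.
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])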
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = one_hot_train_labels[:1000]
partial_y_train = one_hot_train_labels[1000:]
Slice off the first 1,000 training samples to use as a validation set.
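As an aside, fit() can also hold out validation data automatically via its validation_split argument; a minimal sketch (note the held-out fraction is taken from the end of the arrays without shuffling, so it differs from the manual slice above):

# Alternative: let fit() hold out 10% of the training data itself.
history = model.fit(x_train, one_hot_train_labels,
                    epochs=20, batch_size=512,
                    validation_split=0.1)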
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

Epoch 1/20
16/16 [==============================] - 2s 80ms/step - loss: 2.6554 - accuracy: 0.5159 - val_loss: 1.8070 - val_accuracy: 0.6120
Epoch 2/20
16/16 [==============================] - 1s 44ms/step - loss: 1.5197 - accuracy: 0.6849 - val_loss: 1.3832 - val_accuracy: 0.6890
Epoch 3/20
16/16 [==============================] - 1s 45ms/step - loss: 1.1713 - accuracy: 0.7534 - val_loss: 1.2160 - val_accuracy: 0.7310
Epoch 4/20
16/16 [==============================] - 1s 46ms/step - loss: 0.9638 - accuracy: 0.7943 - val_loss: 1.1071 - val_accuracy: 0.7570
Epoch 5/20
16/16 [==============================] - 1s 46ms/step - loss: 0.7970 - accuracy: 0.8324 - val_loss: 1.0314 - val_accuracy: 0.7800
Epoch 6/20
16/16 [==============================] - 1s 58ms/step - loss: 0.6684 - accuracy: 0.8571 - val_loss: 0.9952 - val_accuracy: 0.7820
Epoch 7/20
16/16 [==============================] - 1s 80ms/step - loss: 0.5594 - accuracy: 0.8794 - val_loss: 0.9362 - val_accuracy: 0.8010
Epoch 8/20
16/16 [==============================] - 1s 87ms/step - loss: 0.4675 - accuracy: 0.9018 - val_loss: 0.9394 - val_accuracy: 0.8030
Epoch 9/20
16/16 [==============================] - 1s 46ms/step - loss: 0.3972 - accuracy: 0.9157 - val_loss: 0.9118 - val_accuracy: 0.8210
Epoch 10/20
16/16 [==============================] - 1s 46ms/step - loss: 0.3390 - accuracy: 0.9286 - val_loss: 0.8808 - val_accuracy: 0.8230
Epoch 11/20
16/16 [==============================] - 1s 46ms/step - loss: 0.2901 - accuracy: 0.9360 - val_loss: 0.8911 - val_accuracy: 0.8190
Epoch 12/20
16/16 [==============================] - 1s 43ms/step - loss: 0.2541 - accuracy: 0.9436 - val_loss: 0.8970 - val_accuracy: 0.8180
Epoch 13/20
16/16 [==============================] - 1s 45ms/step - loss: 0.2222 - accuracy: 0.9474 - val_loss: 0.9230 - val_accuracy: 0.8050
Epoch 14/20
16/16 [==============================] - 1s 46ms/step - loss: 0.2038 - accuracy: 0.9483 - val_loss: 0.9114 - val_accuracy: 0.8100
Epoch 15/20
16/16 [==============================] - 1s 46ms/step - loss: 0.1798 - accuracy: 0.9523 - val_loss: 0.9428 - val_accuracy: 0.8220
Epoch 16/20
16/16 [==============================] - 1s 47ms/step - loss: 0.1710 - accuracy: 0.9529 - val_loss: 0.9348 - val_accuracy: 0.8210
Epoch 17/20
16/16 [==============================] - 1s 45ms/step - loss: 0.1586 - accuracy: 0.9540 - val_loss: 0.9408 - val_accuracy: 0.8130
Epoch 18/20
16/16 [==============================] - 1s 45ms/step - loss: 0.1455 - accuracy: 0.9565 - val_loss: 0.9382 - val_accuracy: 0.8120
Epoch 19/20
16/16 [==============================] - 1s 44ms/step - loss: 0.1418 - accuracy: 0.9563 - val_loss: 0.9560 - val_accuracy: 0.8170
Epoch 20/20
16/16 [==============================] - 1s 44ms/step - loss: 0.1325 - accuracy: 0.9567 - val_loss: 0.9721 - val_accuracy: 0.8180
Train for 20 epochs.
import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'r-', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()  # needed when running as a script rather than in a notebook
Plot the training and validation loss per epoch.
plt.figure()  # start a new figure so the curves don't overlap the loss plot
plt.plot(epochs, acc, 'bo', label='Training accuracy')
plt.plot(epochs, val_acc, 'r-', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Plot the accuracy per epoch. Validation performance stops improving around epoch 9-10, which shows that overfitting sets in there.
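Instead of reading the cutoff epoch off the plot by hand, Keras can stop training automatically when the validation loss stops improving; a sketch using the EarlyStopping callback (not in the original post):

from keras.callbacks import EarlyStopping

# Stop when val_loss has not improved for 2 consecutive epochs,
# and roll back to the weights from the best epoch.
early_stopping = EarlyStopping(monitor='val_loss', patience=2,
                               restore_best_weights=True)
history = model.fit(partial_x_train, partial_y_train,
                    epochs=20, batch_size=512,
                    validation_data=(x_val, y_val),
                    callbacks=[early_stopping])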
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=10,
                    batch_size=512,
                    validation_data=(x_val, y_val))
eval_result = model.evaluate(x_test, one_hot_test_labels)
print(eval_result)

Epoch 1/10
16/16 [==============================] - 2s 61ms/step - loss: 2.8128 - accuracy: 0.4023 - val_loss: 1.8531 - val_accuracy: 0.6380
Epoch 2/10
16/16 [==============================] - 1s 48ms/step - loss: 1.5607 - accuracy: 0.6928 - val_loss: 1.3945 - val_accuracy: 0.6880
Epoch 3/10
16/16 [==============================] - 1s 48ms/step - loss: 1.1860 - accuracy: 0.7492 - val_loss: 1.1779 - val_accuracy: 0.7390
Epoch 4/10
16/16 [==============================] - 1s 46ms/step - loss: 0.9683 - accuracy: 0.7914 - val_loss: 1.0715 - val_accuracy: 0.7720
Epoch 5/10
16/16 [==============================] - 1s 45ms/step - loss: 0.7987 - accuracy: 0.8305 - val_loss: 1.0086 - val_accuracy: 0.7840
Epoch 6/10
16/16 [==============================] - 1s 61ms/step - loss: 0.6703 - accuracy: 0.8576 - val_loss: 0.9419 - val_accuracy: 0.8020
Epoch 7/10
16/16 [==============================] - 1s 82ms/step - loss: 0.5618 - accuracy: 0.8790 - val_loss: 0.9107 - val_accuracy: 0.8170
Epoch 8/10
16/16 [==============================] - 1s 77ms/step - loss: 0.4730 - accuracy: 0.9004 - val_loss: 0.9030 - val_accuracy: 0.8050
Epoch 9/10
16/16 [==============================] - 1s 44ms/step - loss: 0.4011 - accuracy: 0.9159 - val_loss: 0.8767 - val_accuracy: 0.8250
Epoch 10/10
16/16 [==============================] - 1s 48ms/step - loss: 0.3415 - accuracy: 0.9278 - val_loss: 0.8677 - val_accuracy: 0.8230
71/71 [==============================] - 0s 3ms/step - loss: 0.9412 - accuracy: 0.7912
[0.9411991834640503, 0.7911843061447144]
Rebuilding the model, training for 10 epochs, and evaluating on the test set gives about 79% accuracy.
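To turn the model's softmax outputs into topic predictions, take the argmax over the 46 class probabilities; a short sketch (illustrative, not in the original post):

predictions = model.predict(x_test)
print(predictions.shape)          # (2246, 46): one probability per topic
print(predictions[0].sum())       # ~1.0: softmax outputs form a distribution
print(np.argmax(predictions[0]))  # index of the most likely topic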