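Application of Knowledge Distillation to 1D CNN and 3D CNN Convolutional Neural Network Architectures (Penerapan Knowledge Distillation terhadap Arsitektur Convolutional Neural Network 1D CNN dan 3D CNN)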

Muhammad Taufiq Pratama (2018) | Undergraduate thesis (Skripsi) | Informatics Engineering, Computer Engineering

Abstract

On massive datasets, deep neural networks with complex hidden-layer structures, such as convolutional neural networks (CNNs), achieve higher accuracy than neural networks with only a single hidden layer. However, this complexity increases the inference runtime and the storage the model requires. Several studies have sought to reduce the runtime and storage required by deep neural network models, among them work applying the knowledge distillation method. This method can reduce the training and inference runtime of models with a two-dimensional CNN (2D CNN) architecture, with an insignificant reduction in accuracy as the trade-off. The method has not yet been tried on CNN architectures of other dimensions, such as 1D CNN and 3D CNN. This research therefore observes the performance of knowledge distillation when applied to 1D CNN and 3D CNN architectures. The results show that the distilled 1D CNN model has shorter training and inference runtime, with an accuracy reduction of 17.44% compared to the original model. The distilled 3D CNN model has shorter inference runtime, but longer training runtime in some cases, with an accuracy reduction of 5.83% compared to the original model.

Keywords: 1D CNN, 3D CNN, Convolutional Neural Network, Knowledge Distillation, Teacher-student Strategy.

Alternative Abstract

On large datasets, deep neural networks with complex hidden-layer structures, such as convolutional neural networks (CNNs), have been shown to achieve higher accuracy than neural networks with only one hidden layer. However, this gain in accuracy comes with an increase in the runtime and storage the model requires. Several studies have addressed reducing both the runtime and the storage required by deep neural networks, including work applying the knowledge distillation method. This method has been shown to reduce the training and inference runtime of two-dimensional CNNs (2D CNNs), with a reduction in accuracy as the trade-off. However, it has not been tried on CNN architectures of other dimensions, such as 1D CNN and 3D CNN. Hence, this research observes the performance of knowledge distillation applied to 1D CNN and 3D CNN architectures. The results show that the distilled 1D CNN model shortens both training and inference runtime, with a 17.44% reduction in accuracy compared to the original model. The distilled 3D CNN model shortens inference runtime, though with longer training runtime in some cases, with a 5.83% reduction in accuracy compared to the original model.

Keywords: 1D CNN, 3D CNN, Convolutional Neural Network, Knowledge Distillation, Teacher-student Strategy.
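
For readers unfamiliar with the teacher-student strategy named in the keywords, the following is a minimal sketch of the widely used soft-target formulation of a distillation loss. The abstract does not state which loss the thesis uses, so this is an illustration under assumptions rather than the author's method: the function names, the temperature of 4.0, and the weighting factor alpha are all hypothetical choices.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are illustrative values (assumptions),
    not settings reported in the abstract.
    """
    # Soft term: the student imitates the teacher's softened outputs.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Scaling by temperature**2 keeps the two terms on a comparable scale.
    soft_loss = cross_entropy(soft_teacher, soft_student) * temperature ** 2

    # Hard term: ordinary cross-entropy against the one-hot true label.
    one_hot = [1.0 if i == true_label else 0.0
               for i in range(len(student_logits))]
    hard_loss = cross_entropy(one_hot, softmax(student_logits))

    return alpha * soft_loss + (1 - alpha) * hard_loss
```

A student whose logits agree with the teacher, for example `distillation_loss([5.0, 1.0, 0.0], [5.0, 1.0, 0.0], 0)`, receives a lower loss than one that disagrees, such as `distillation_loss([0.0, 1.0, 5.0], [5.0, 1.0, 0.0], 0)`. The formulation itself does not depend on whether the teacher and student are 1D, 2D, or 3D CNNs, since it only compares their output logits.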

Source

http://digilib.polban.ac.id/gdl.php?mod=browse&op=read&id=jbptppolban-gdl-muhammadta-9715
