These are my own ideas on how the neural network architecture might be improved, kept as notes to remind myself of their further implementation.
Measure for Overfitting
TODO: Add a measure with respect to accuracy.
TODO: Compare $O$ with normal CNN/DNN.
We do need some measure to see whether, and how much, the network is overfitting the training data.
My first thought is to just use $O=LOSS(X_{test})-LOSS(X_{train})$. This seems to work: the value stays at around $0\pm0.04$, which means sometimes the network performs better on the training set ($O>0$) and sometimes the opposite ($O<0$).
However, after accuracy reaches $99.2\%$, the frequency of observing a positive overfit index $O$ increases, and the average index increases as well.
To compare this index with the overfitting condition of other networks, I first need to somehow incorporate accuracy into the calculation of the index, and then compare it across other architectures.
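A quick sketch of how $O$ could be computed each epoch, assuming a Keras-style `model` whose `evaluate` returns the loss; the names and setup here are my own illustration, not the actual experiment code:

```python
def overfit_index(model, x_train, y_train, x_test, y_test):
    """O = LOSS(X_test) - LOSS(X_train); a larger positive O suggests the
    network fits the training set better than the test set."""
    train = model.evaluate(x_train, y_train, verbose=0)
    test = model.evaluate(x_test, y_test, verbose=0)
    # `evaluate` returns [loss, metric1, ...] when metrics were compiled in.
    train_loss = train[0] if isinstance(train, (list, tuple)) else train
    test_loss = test[0] if isinstance(test, (list, tuple)) else test
    return test_loss - train_loss
```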
2017/11/21
Generator For Cheating a Network
What would happen if we first train a classifier to classify images, and then train another network that distorts an input image so as to maximize the change in the aforementioned classifier's prediction while at the same time minimizing the distortion to the image?
This could be implemented using a new cost function for the second network.
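Roughly, the second network's cost could combine a reward for changing the classifier's output with a penalty for distorting the image. A minimal sketch, assuming `classifier` returns softmax probabilities, `generator` returns the distorted image, and `lam` is a hypothetical trade-off weight:

```python
import tensorflow as tf

def cheat_loss(classifier, generator, x, lam=1.0):
    x_adv = generator(x)                      # distorted image
    p_orig = classifier(x)
    p_adv = classifier(x_adv)
    # Maximize the change in the classifier's prediction...
    change = tf.reduce_mean(tf.reduce_sum(tf.square(p_adv - p_orig), axis=-1))
    # ...while keeping the distortion small (images assumed [batch, H, W, C]).
    distortion = tf.reduce_mean(tf.reduce_sum(tf.square(x_adv - x), axis=[1, 2, 3]))
    return -change + lam * distortion         # minimized w.r.t. the generator's weights
```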
Implementing.
2017/10/27
I failed to implement this structure in TensorFlow: instead of an actual value, I seemed to be getting the uninitialized tensor representation of the output, and I failed to replace the input of the original network with that tensor.
2017/10/29
I found the Generative Adversarial Network (GAN) model, and there isn’t that much of a difference between my idea and that model. However, using another network to probe the weak points of an existing network still seems like a promising way of improving it.
2017/10/30
This looks like a possible way of modifying the GAN model: while the discriminator and the generator in the original GAN model work against each other, maybe we can instead have the generator create augmented data from samples the discriminator previously misclassified. Testing.
2017/10/31
I’m pretty sure this will work, but there is more to it. If I train a generator to make the minimum modification to a picture while at the same time cheating the discriminator, this may work. However, if I then use the modified images to retrain the network, things may not work out the same way… A badly written 6 might be twisted into something that looks like an 8 even from a human perspective, and training the discriminator with an image of an 8 labeled as a 6 is not likely to improve its performance.
However, adding a regularization term that forces the generated images to be different from the training set may be a good way for generators to produce hopefully more creative samples. To be tested, too.
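One way such a regularizer could look, as a rough sketch only: penalize how close each generated image is to its nearest neighbour in a batch of training images (`x_gen` and `x_train_batch` are hypothetical tensors of flattened images):

```python
import tensorflow as tf

def novelty_penalty(x_gen, x_train_batch):
    # Pairwise squared distances between generated and training images.
    diff = tf.expand_dims(x_gen, 1) - tf.expand_dims(x_train_batch, 0)
    d2 = tf.reduce_sum(tf.square(diff), axis=-1)     # [n_gen, n_train]
    # Penalize being too close to the nearest training sample.
    nearest = tf.reduce_min(d2, axis=1)
    return tf.reduce_mean(1.0 / (nearest + 1e-6))
```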
2017/11/1
This is indeed different from a regular GAN. It looks like it is working, but further experiments need to be done.
2017/11/2
Seems to be working. Work put on hold for something else.
2017/11/8
This is called a Generative Poisoning Attack, and extensive research seems to have already been performed on it.
2017/11/8
Ambiguity Measure for Dropout
Since increasing ambiguity is a good thing to do in ensembles of classifiers, and dropout is essentially a cheap way to ensemble networks, why don’t we add another regularization term to the cost function to increase the ambiguity while using dropout?
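A speculative sketch of what that could look like: run several stochastic forward passes with dropout active, reward the variance ("ambiguity") among them, and add it to the usual loss with a hypothetical weight `beta`:

```python
import tensorflow as tf

def loss_with_ambiguity(model, x, y, n_passes=4, beta=0.1):
    preds = [model(x, training=True) for _ in range(n_passes)]   # dropout active
    stacked = tf.stack(preds, axis=0)                            # [n_passes, batch, classes]
    mean_pred = tf.reduce_mean(stacked, axis=0)
    # Ambiguity: variance of the dropout sub-networks around their mean prediction.
    ambiguity = tf.reduce_mean(tf.math.reduce_variance(stacked, axis=0))
    base = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y, mean_pred))
    return base - beta * ambiguity                               # reward disagreement
```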
2017/11/1
Certainty Measure for the Result
According to my observation, it’s weird that classifier networks always give a clear indication of the result and there is no indication of “uncertain”. In other words, it seems unlikely for multiple output nodes, or none of them, to fire at the same time.
The above may not necessarily be true, but it’s obvious that when a classifier network is given a sample totally different from any of the samples in the training set, the result tends to come from a weighted combination of the training examples. Would it be possible to add another class to contain everything else and feed in random noise?
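A toy sketch of the “everything else” class idea, just to make it concrete (shapes and names are assumptions, and labels are assumed to be integers):

```python
import numpy as np

def add_noise_class(x_train, y_train, n_noise=5000, n_classes=10):
    noise_x = np.random.uniform(0.0, 1.0, (n_noise,) + x_train.shape[1:]).astype(x_train.dtype)
    noise_y = np.full((n_noise,), n_classes)           # extra "none of the above" label
    x = np.concatenate([x_train, noise_x], axis=0)
    y = np.concatenate([y_train, noise_y], axis=0)
    return x, y
```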
Unlikely to be useful.
2017/11/1
Recursive Conv Net
What would happen if we make a convolutional neural network recursive? Would this somehow benefit video generation?
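What I have in mind, as a bare sketch, is feeding the network’s own output back in as the next input to roll out frames (`conv_net` is a hypothetical image-to-image model):

```python
def rollout(conv_net, first_frame, n_steps=10):
    frames = [first_frame]
    for _ in range(n_steps):
        frames.append(conv_net(frames[-1]))   # output becomes the next input
    return frames
```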
To be tested.
2017/10/27
It exists…
2017/10/30
Dynamic Cost Function
Is it possible to make the cost function dynamic, changing according to the training conditions, so that convergence could be faster? Or could we even use an RNN to generate the cost function? How would that help? What changes would that bring about?
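One hand-wavy form a dynamic cost could take, purely as an illustration: interpolate between two losses depending on a training-progress signal such as the current accuracy:

```python
import tensorflow as tf

def dynamic_loss(y_true, y_pred, accuracy):
    xent = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true, y_pred))
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    w = tf.clip_by_value(accuracy, 0.0, 1.0)   # shift emphasis as training progresses
    return (1.0 - w) * xent + w * mse
```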
To be tested.
2017/10/27
Flip network
According to here, it looks like computers may have different speeds for adding and subtracting numbers. If we flip the network so that all additions become subtractions and vice versa (then we might be maximizing the “cost” function), will it be any faster?
To be tested.
2017/10/27