Splitting a full image into patches is not a complicated task in itself, but there are some factors to consider:
Size of Patch
Size of Overlap (i.e. Stride)
The patches must be large enough for the network to pick up distinct patterns for each category, and the stride must be small enough to reduce potential spatial information loss. For the optimal patch and stride sizes, I initially considered some trial and error, but training a full-scale neural network takes weeks per run. In the end, I referenced successful patch-based Convolutional Neural Network studies that used similarly sized full images and based my numbers on theirs.
Patch Size: 512 pixels
Stride Size: 256 pixels
Here are snippets of code to extract patches.
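As a rough sketch (not the original code), the sliding-window extraction could look like the following, assuming the image is already loaded as a NumPy array; the function name and the border handling are my own choices:

```python
import numpy as np

PATCH_SIZE = 512   # pixels, as chosen above
STRIDE = 256       # 50% overlap between neighbouring patches

def extract_patches(image: np.ndarray) -> list:
    """Slide a PATCH_SIZE window over the image with the given STRIDE
    and collect every fully contained patch."""
    patches = []
    height, width = image.shape[:2]
    for top in range(0, height - PATCH_SIZE + 1, STRIDE):
        for left in range(0, width - PATCH_SIZE + 1, STRIDE):
            patches.append(image[top:top + PATCH_SIZE, left:left + PATCH_SIZE])
    return patches
```

In this sketch, patches that would run past the right or bottom border are simply dropped; padding the image or shifting the last window are common alternatives.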
Patch Neural Network Architecture
After extracting the patches, the overall structure is very similar to a regular convolutional neural network.
Here is the code snippet.
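Below is a minimal sketch of what such a patch classifier could look like, assuming a small stack of convolution + BatchNorm2d blocks, 2x2 stride-2 convolutions for downsampling, and a log_softmax output; the layer counts, channel widths, and num_classes are placeholders rather than the exact values used:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchNet(nn.Module):
    """Patch-level classifier: conv + BatchNorm2d blocks with 2x2 stride-2
    convolutions instead of max pooling (channel widths are placeholders)."""

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=2, stride=2),   # downsample instead of max pooling
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=2, stride=2),   # second downsampling stage
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        feature_map = self.features(x)           # feature map handed on to Level 2 - Image
        pooled = feature_map.mean(dim=(2, 3))    # global average over spatial dimensions
        return F.log_softmax(self.classifier(pooled), dim=1)
```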
As mentioned in the Methods Utilized post, each stage includes a BatchNorm2d (Batch Normalization) layer. Double downsampling was also integrated because the network overfit when tested. Note that the output of this network is a feature map passed on to Level 2 - Image, so spatial information is no longer available after this level.
Moreover, it is common in neural networks to use max pooling, but I used a 2x2 convolution layer with a stride of 2 instead (they do essentially the same thing; I chose the convolution because it performed better in terms of accuracy on a subset of the data). If you are familiar with machine learning, you will have noticed the log_softmax function. Since the network is classifying more than two categories, it makes sense to use SoftMax.
Note that after each epoch the model is saved into a PyTorch checkpoint file.
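A sketch of that step, assuming a plain torch.save call that records the epoch number along with the model and optimizer state (the file name pattern is a placeholder):

```python
import torch

def save_checkpoint(model, optimizer, epoch, path_template="patch_net_epoch_{}.pt"):
    """Save model and optimizer state at the end of an epoch (file name is a placeholder)."""
    torch.save(
        {
            "epoch": epoch,
            "model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict(),
        },
        path_template.format(epoch),
    )
```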