Yes, you should be able to upgrade to 14-0.
The possible problem with the physical sector size boundary is not the system partitions but the byte where /dev/loop1 starts (the offset). If that offset is not on a physical sector boundary, this problem may arise. I asked mijzelf whether we should not use a multiple of 4096 + 1; he said no, but maybe he is wrong.
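To make the offset check concrete, here is a rough sketch in Python. It assumes a physical sector size of 4096 bytes, and the function names are made up for illustration; they are not taken from any script used on the device.

```python
# Assumption: the disk uses 4096-byte physical sectors.
PHYSICAL_SECTOR = 4096


def is_aligned(offset_bytes, sector=PHYSICAL_SECTOR):
    """Return True if the offset falls exactly on a physical sector boundary."""
    return offset_bytes % sector == 0


def next_aligned(offset_bytes, sector=PHYSICAL_SECTOR):
    """Round the offset up to the nearest physical sector boundary."""
    # Ceiling division written with integers only.
    return -(-offset_bytes // sector) * sector
```

If the real offset of /dev/loop1 fails `is_aligned`, then `next_aligned` gives the closest higher offset that would sit on a boundary.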
If a write does not start on a physical sector boundary, writing the data to disk takes more time, because the drive has to modify two physical sectors instead of one. Read more here, for example: http://www.ibm.com/developerworks/linux ... tor-disks/
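The "two sectors instead of one" effect can be shown with a little arithmetic. This is only a sketch, again assuming 4096-byte physical sectors; the helper name is hypothetical.

```python
def physical_sectors_touched(offset_bytes, length_bytes, sector=4096):
    """Count how many physical sectors a write of length_bytes at offset_bytes covers."""
    if length_bytes <= 0:
        return 0
    first = offset_bytes // sector                        # sector holding the first byte
    last = (offset_bytes + length_bytes - 1) // sector    # sector holding the last byte
    return last - first + 1
```

A 4096-byte write at offset 0 covers one physical sector, while the same write at offset 512 spans two, so the drive has to partially update both.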
I am not sure if this explains what you observe, but seeing that the read speed is almost OK and the write speed is significantly reduced makes me think that we have an alignment issue here.