It is not 1 + 1. It is 1 + 1 + a set/enclosure structure.
If you form two groups of 2 images, I do not expect a similar increase.
What happens if you add a drawing canvas into the mix?
Why is this of concern given that with most file structures there is no difference in storing a 2 Mb or a 5 Mb file? (Or is my understanding of file structures flawed? I haven't looked at this since the development of terabyte drives.)
|