CNN demands lot of data while capsule networks does not require that much data as compare to CNN. CNN are Invariant of Translation which means that they lack to identify the position on an object to another, that is why capsule networks are used because they have the capability to identify the position of a single object.
They can be used as -