Because Object Position isn't surface direction.
Even if you did use the object's world position, you'd get a single 3-component vector representing the object's center point. Even if you normalized that vector to get a valid direction for a cubemap lookup, you'd still get a single color. Now, if you want an object that changes color based on its direction from the world origin, that may be perfect, but if you want reflection-type effects you need the direction to sample the cubemap from on a per-pixel (or at least per-vertex) basis, so that the sampled colors change across the surface of the object.
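Roughly what that per-fragment direction looks like in a fragment shader. This is just a sketch in GLSL; the uniform and varying names (uCubemap, uCameraPos, vWorldPos, vWorldNormal) are made up for illustration, not from any particular engine:

```glsl
#version 330 core

uniform samplerCube uCubemap;   // hypothetical world-space cubemap
uniform vec3 uCameraPos;        // hypothetical camera world position

in vec3 vWorldPos;              // interpolated per-fragment world position
in vec3 vWorldNormal;           // interpolated per-fragment world normal

out vec4 fragColor;

void main()
{
    // Direction from the camera to this fragment, not to the object's center.
    vec3 viewDir = normalize(vWorldPos - uCameraPos);

    // Reflect it about the surface normal; this direction varies across
    // the surface, so the sampled cubemap color varies with it.
    vec3 reflDir = reflect(viewDir, normalize(vWorldNormal));

    fragColor = texture(uCubemap, reflDir);
}
```

A single object-center direction would give you the same texture(uCubemap, dir) result for every pixel, which is why it only works for the "solid color by direction" case.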
For a cubemap in world space, like one you get from a realtime render-to-texture cube, you'd want the object's world-space normals to sample it accurately. The way the shader pipeline is written, these aren't directly passed in, as most effects require tangent-space data... or rather, most effects ASSUME tangent-space data, since they work on a per-vertex or per-fragment basis and that's the most logical space to work in. It's like walking to the corner store: you think of walking forward the whole way, rather than charting your changes in azimuth and inclination relative to the center of the earth. The same sort of reference frame applies to normal mapping on deformable objects and other things.
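So if your normal actually comes from a normal map (tangent space), you'd rotate it into world space before doing the cubemap lookup, typically with a per-vertex TBN basis. Again a hedged sketch with made-up names (uNormalMap, vTBN, vUV), assuming the TBN matrix was built in world space in the vertex shader:

```glsl
#version 330 core

uniform samplerCube uCubemap;
uniform sampler2D   uNormalMap;
uniform vec3        uCameraPos;

in vec3 vWorldPos;
in vec2 vUV;
in mat3 vTBN;          // tangent, bitangent, normal columns in world space

out vec4 fragColor;

void main()
{
    // Unpack the tangent-space normal from [0,1] texture range to [-1,1].
    vec3 tangentNormal = texture(uNormalMap, vUV).xyz * 2.0 - 1.0;

    // Rotate it into world space so it matches the cubemap's reference frame.
    vec3 worldNormal = normalize(vTBN * tangentNormal);

    vec3 viewDir = normalize(vWorldPos - uCameraPos);
    fragColor = texture(uCubemap, reflect(viewDir, worldNormal));
}
```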