The intrinsic parameters include the focal lengths fx and fy, the principal point cx and cy, and the skew, plus additional distortion parameters (k1-k5, r1-r2).
Assume you have no distortion and that cx and cy are exactly at the image center, with the image origin at the top left as usual. As you say, you know some ground-truth 3D points. Assume these 3D measurements are expressed in the camera frame, with Z along the optical axis. A 3D point P then projects onto the camera image plane at a pixel p. The ray through O (the camera optical center), p and P, together with the optical axis, forms two similar triangles: one inside the camera with sides fx and (p_x-cx), and one in the scene with sides P_z and P_x. So:
fx / (p_x-cx) = P_z / P_x
fx = (p_x-cx) * P_z / P_x
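The same goes for fy, using (p_y-cy) and P_y. The point must not lie on the optical axis (P_x and P_y non-zero), otherwise you divide by zero. For cameras with square pixels, fx and fy are usually almost the same.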
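The formula above can be sketched in a few lines of Python. This is only a minimal sketch under the same assumptions (no distortion, no skew, known cx and cy, 3D points in the camera frame). The function name `estimate_focal_lengths` and the sample values are made up for illustration, and with several points the per-point formula is replaced by a least-squares fit of the same relation.

```python
import numpy as np

def estimate_focal_lengths(points_3d, points_2d, cx, cy):
    """Least-squares fx, fy from 3D-2D pairs (hypothetical helper).

    Assumes no distortion, no skew, and 3D points in the camera frame
    with Z along the optical axis.
    """
    P = np.asarray(points_3d, dtype=float)
    p = np.asarray(points_2d, dtype=float)
    x_n = P[:, 0] / P[:, 2]   # P_x / P_z
    y_n = P[:, 1] / P[:, 2]   # P_y / P_z
    u = p[:, 0] - cx          # p_x - cx
    v = p[:, 1] - cy          # p_y - cy
    # Fit u = fx * x_n and v = fy * y_n (lines through the origin).
    # With a single point this reduces to fx = (p_x - cx) * P_z / P_x.
    fx = np.dot(x_n, u) / np.dot(x_n, x_n)
    fy = np.dot(y_n, v) / np.dot(y_n, y_n)
    return fx, fy

# Synthetic check: project made-up points with a known focal length.
true_f, cx, cy = 800.0, 320.0, 240.0
points_3d = np.array([[0.5, 0.2, 4.0], [-1.0, 0.8, 5.0], [0.3, -0.6, 2.5]])
points_2d = np.column_stack([
    true_f * points_3d[:, 0] / points_3d[:, 2] + cx,
    true_f * points_3d[:, 1] / points_3d[:, 2] + cy,
])
fx, fy = estimate_focal_lengths(points_3d, points_2d, cx, cy)
print(fx, fy)
```

With real measurements the two values will differ slightly because of noise. Points close to the optical axis carry little information about the focal length, so points well away from the center help more.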
This only holds under the ideal assumption that the camera has no distortion. Once there is distortion, you need enough sample points spread all over the image to model it. One or two points won't give you the whole picture.
Some papers use shortcuts, such as the sea horizon line (see the ref below, it is a series of works) or the vanishing points of well-structured buildings, to detect the distortion. We start from the extrinsics and work toward the intrinsics, and after some trials this can eventually give a good estimate. But it is still very much research and cannot be applied to general cases.
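To see why a single point is not enough once distortion is present, here is a small sketch. It assumes a one-term radial distortion model (only k1, applied to normalized coordinates), which is a simplification chosen for illustration and not necessarily the model of your camera. The function name `naive_fx` and all numbers are made up.

```python
import numpy as np

def naive_fx(P, fx, cx, k1):
    """Apply the distortion-free formula to a pixel that was actually
    produced with radial distortion (hypothetical helper)."""
    X, Y, Z = P
    x_n, y_n = X / Z, Y / Z
    r2 = x_n**2 + y_n**2
    # Assumed model: distorted x = x_n * (1 + k1 * r^2)
    p_x = fx * x_n * (1.0 + k1 * r2) + cx
    # Formula from the answer, which ignores distortion
    return (p_x - cx) * Z / X

true_fx, cx, k1 = 800.0, 320.0, -0.2
near_center = naive_fx((0.1, 0.0, 5.0), true_fx, cx, k1)
near_edge = naive_fx((2.0, 0.0, 5.0), true_fx, cx, k1)
print(near_center, near_edge)
```

The estimate from the point near the center stays close to the true value, while the one near the edge is clearly off. The error grows with the distance from the center, which is why the samples need to cover the whole image.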
Ref: Han Wang, Wei Mou, Xiaozheng Mou, Shenghai Yuan, Soner Ulun, Shuai Yang and Bok-Suk Shin, "An Automatic Self-Calibration Approach for Wide Baseline Stereo Cameras Using Sea Surface Images", Unmanned Systems.