The naive/basic camera space (view space) to screen space point transformation looks like
vec2 camToScreenSpace(vec4 p)
{
   vec4 v = projection_matrix * p;
   return v.xy / v.w; // do perspective divide
}
and a corresponding naive way to get a screen-space direction vector from a camera-space point and vector pair is
// Returns an unnormalised vector
vec2 camSpaceDirToScreenSpaceDir(vec4 p_cs, vec4 dir_cs)
{
   return camToScreenSpace(p_cs + dir_cs) - camToScreenSpace(p_cs);
}
camToScreenSpace takes something like 16 float multiplies and 12 additions for the matrix-vector mul, and 2 divides for the perspective divide. So camSpaceDirToScreenSpaceDir takes something like 32 multiplies, 24 additions/subtractions, and 4 divides. That's quite a lot of operations!

We can do much better!

If you use the standard perspective projection with OpenGL camera-space (view-space) coordinate convention (+x to the right, +y up, +z out of screen back from camera), camToScreenSpace looks like

vec2 fastCamToScreenSpace(vec4 pos_cs)
{
    return vec2(
	pos_cs.x / -pos_cs.z * l_over_w + 0.5,
	pos_cs.y / -pos_cs.z * l_over_h + 0.5
    );
}
Where l is the sensor to lens distance, and w is the sensor width for a thin-lens perspective camera. \(l/w\) is also equal to \(2 \tan(\alpha)\) where \(\alpha\) is half the horizontal angle of view.

In maths: $$ p_{ss_x} = {p_x \over -p_z} {l \over w} + {1 \over 2} \\ p_{ss_y} = {p_y \over -p_z} {l \over h} + {1 \over 2} $$ Where \(p_{ss}\) is a point in screen space and \(p\) is a point in camera space.

So how can we get the screen-space direction from this?

Consider a ray in camera space, parameterised by t, such that a point along the ray at distance t is given by $$ r(t) = p + t d $$ Then the direction in screen space is given by $$ {d \over dt} p_{ss} = ({d \over dt} p_{ss_x}, {d \over dt} p_{ss_y}) $$ Taking x coord first: $$ {d \over dt} p_{ss_x} = {d \over dt} [ {p_x \over -p_z} {l \over w} + {1 \over 2} ] $$ and substituting in our expression for the point on the ray \(p = r(t)\): $$ {d \over dt} p_{ss_x} = {d \over dt} [ {r(t)_x \over -r(t)_z} {l \over w} + {1 \over 2} ]\\ = -{l \over w} {d \over dt} [ {r(t)_x \over r(t)_z}] \\ $$ Now we can use the quotient rule: $$ {d \over dt} p_{ss_x} = -{l \over w} [ {{d \over dt} r(t)_x r(t)_z - r(t)_x {d \over dt} r(t)_z \over r(t)_z^2 } ] $$ And substituting in our expressions for \(r(t)\): $$ {d \over dt} p_{ss_x} = -{l \over w} [ {{d \over dt} (p_x + t d_x) (p_z + t d_z) - (p_x + t d_x) {d \over dt} (p_z + t d_z) \over (p_z + t d_z)^2 } ] \\ = -{l \over w} [ {d_x (p_z + t d_z) - (p_x + t d_x) d_z \over (p_z + t d_z)^2 } ] \\ = -{l \over w} [ {d_x p_z + t d_x d_z - p_x d_z - t d_x d_z \over (p_z + t d_z)^2 } ]\\ = -{l \over w} [ {d_x p_z - p_x d_z \over (p_z + t d_z)^2 } ] $$ Evaluated at \(t=0\) gives $$ {d \over dt} p_{ss_x} |_{t=0} = -{l \over w} [ {d_x p_z - p_x d_z \over p_z^2 } ] $$ Likewise for the y coord: $$ {d \over dt} p_{ss_y} |_{t=0} = -{l \over h} [ {d_y p_z - p_y d_z \over p_z^2 } ] $$ If we are just interested in the unnormalised direction vector, we can drop the common \( {1\over p_z^2} \) factor, giving the vector result: $$ d_{ss} = (-{l \over w} (d_x p_z - p_x d_z), -{l \over h} (d_y p_z - p_y d_z)) \\ = ({l \over w} (p_x d_z - d_x p_z), {l \over h} (p_y d_z - d_y p_z)) \\ $$ Or in code:

// Returns an unnormalised vector
vec2 fastCamSpaceDirToScreenSpaceDir(vec4 p, vec4 d)
{
    return vec2(
	(p.x * d.z - p.z * d.x) * l_over_w,
	(p.y * d.z - p.z * d.y) * l_over_h
    );
}
This obviously requires far fewer float operations - 6 multiplies and 2 subtractions.