There’s no policy for that. As long as there’s runtime CPU feature detection, SIMD assembly optimizations wouldn’t be a problem at all. You can find some examples in the audio converter’s resampling code.
In most other places we use ORC, which is a simple data processing language that is then JIT-compiled to SIMD assembly. This is used in the video converter, for example, but the language is not powerful enough to express matrix multiplication.
For the function you mention here specifically, I think it would be great to have SIMD optimizations. It doesn’t even have to be AVX; SSE alone would already be a big improvement. If you want to give it a try, please just go ahead. For runtime CPU feature detection you could still use ORC there, just like it is done for the resampler.
FWIW, ORC currently only has SSE support (on x86) but AVX support is on the way.
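
To illustrate what I mean by runtime CPU feature detection, here is a minimal sketch of the pattern. It is not the code from the resampler and it doesn't use ORC: it assumes GCC or Clang on x86 and uses the compiler's `__builtin_cpu_supports()` for the check. All function names (`mat_vec4_c`, `mat_vec4_sse`, `select_mat_vec4`) and the column-major 4x4 layout are made up for the example.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <xmmintrin.h>          /* SSE intrinsics */
#define EXAMPLE_HAVE_X86 1      /* hypothetical macro, only for this sketch */
#endif

/* out = M * v for a 4x4 matrix stored column-major: m[col * 4 + row] */
typedef void (*MatVec4Func) (const float m[16], const float v[4], float out[4]);

/* Plain C fallback, always available */
static void
mat_vec4_c (const float m[16], const float v[4], float out[4])
{
  for (int r = 0; r < 4; r++) {
    out[r] = m[0 * 4 + r] * v[0] + m[1 * 4 + r] * v[1]
        + m[2 * 4 + r] * v[2] + m[3 * 4 + r] * v[3];
  }
}

#ifdef EXAMPLE_HAVE_X86
/* SSE variant: scale each column by the matching vector element and sum up.
 * The target attribute lets this compile even if SSE is not enabled globally. */
__attribute__ ((target ("sse")))
static void
mat_vec4_sse (const float m[16], const float v[4], float out[4])
{
  __m128 acc = _mm_mul_ps (_mm_loadu_ps (m + 0), _mm_set1_ps (v[0]));

  acc = _mm_add_ps (acc, _mm_mul_ps (_mm_loadu_ps (m + 4), _mm_set1_ps (v[1])));
  acc = _mm_add_ps (acc, _mm_mul_ps (_mm_loadu_ps (m + 8), _mm_set1_ps (v[2])));
  acc = _mm_add_ps (acc, _mm_mul_ps (_mm_loadu_ps (m + 12), _mm_set1_ps (v[3])));
  _mm_storeu_ps (out, acc);
}
#endif

/* Pick the best implementation once at runtime, e.g. during initialization */
static MatVec4Func
select_mat_vec4 (void)
{
#ifdef EXAMPLE_HAVE_X86
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("sse"))
    return mat_vec4_sse;
#endif
  return mat_vec4_c;
}
```

You would call `select_mat_vec4()` once, store the returned function pointer, and call through it in the hot path. An AVX variant would just be one more check placed before the SSE one.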