In a pad template capability "video/x-raw(ANY)", what does the ANY part mean?

When I run gst-inspect on an element I’m interested in to see the available video formats, the pad templates show an entry for video/x-raw and another one for video/x-raw(ANY):

  SINK template: 'video_sink'
    Availability: Always
                 format: { (string)BGRx, (string)RGBx, (string)xRGB, (string)xBGR, (string)RGBA, (string)BGRA, (string)ARGB, (string)ABGR, (string)RGB, (string)BGR, (string)I420, (string)YV12, (string)AYUV, (string)YUY2, (string)UYVY, (string)v308, (string)Y41B, (string)Y42B, (string)Y444, (string)NV12, (string)NV21, (string)A420, (string)YUV9, (string)YVU9, (string)IYU1, (string)GRAY8 }

                 format: { (string)I420, (string)YV12, (string)YUY2, (string)UYVY, (string)AYUV, (string)RGBx, (string)BGRx, (string)xRGB, (string)xBGR, (string)RGBA, (string)BGRA, (string)ARGB, (string)ABGR, (string)RGB, (string)BGR, (string)Y41B, (string)Y42B, (string)YVYU, (string)Y444, (string)v210, (string)v216, (string)NV12, (string)NV21, (string)GRAY8, (string)GRAY16_BE, (string)GRAY16_LE, (string)v308, (string)RGB16, (string)BGR16, (string)RGB15, (string)BGR15, (string)UYVP, (string)A420, (string)RGB8P, (string)YUV9, (string)YVU9, (string)IYU1, (string)ARGB64, (string)AYUV64, (string)r210, (string)I420_10BE, (string)I420_10LE, (string)I422_10BE, (string)I422_10LE, (string)Y444_10BE, (string)Y444_10LE, (string)GBR, (string)GBR_10BE, (string)GBR_10LE, (string)NV16, (string)NV24, (string)NV12_64Z32, (string)A420_10BE, (string)A420_10LE, (string)A422_10BE, (string)A422_10LE, (string)A444_10BE, (string)A444_10LE, (string)NV61, (string)P010_10BE, (string)P010_10LE, (string)IYU2, (string)VYUY, (string)GBRA, (string)GBRA_10BE, (string)GBRA_10LE, (string)GBR_12BE, (string)GBR_12LE, (string)GBRA_12BE, (string)GBRA_12LE, (string)I420_12BE, (string)I420_12LE, (string)I422_12BE, (string)I422_12LE, (string)Y444_12BE, (string)Y444_12LE, (string)GRAY10_LE32, (string)NV12_10LE32, (string)NV16_10LE32 }

I understand that video/x-raw means the pad data is raw video.
I do not understand what the (ANY) part means on the second video/x-raw entry.
What is it saying? How does it differ from the entry without it?

The formats listed under video/x-raw are a subset of those listed under video/x-raw(ANY). Why is that?

In my pipeline, a source element provides video/x-raw in NV16 format. If I try to link it to the element above, linking fails, despite NV16 being present in the format list of video/x-raw(ANY). If I insert a videoconvert before this element and convert the format to NV12 (which is present in the video/x-raw format list), the elements link and the pipeline works. Why can't I link the source pad to the sink pad in my pipeline even though NV16 is listed there?

The bit in parentheses after a media type is called “caps features”; see GstCapsFeatures.

If it’s absent for video/x-raw the assumption is that this is raw video data in system memory.
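To make the notation concrete, here is a minimal Python sketch that splits a caps string into its media type and its caps features. It only illustrates the syntax and is not GStreamer's own parser; the helper name `split_caps_features` is made up for this example. The fallback to `memory:SystemMemory` reflects the feature name GStreamer uses for plain system memory.

```python
import re

# Feature name GStreamer uses for plain system memory (the implicit default).
SYSTEM_MEMORY = "memory:SystemMemory"


def split_caps_features(caps):
    """Split a caps string into (media_type, features).

    Hypothetical helper for illustration only; it looks at the media type
    and the optional parenthesised feature list and ignores all fields.
    """
    match = re.match(r"\s*([^(),\s]+)\s*(?:\(([^)]*)\))?", caps)
    if match is None:
        raise ValueError(f"unrecognised caps string: {caps!r}")
    media_type, raw_features = match.group(1), match.group(2)
    if raw_features is None:
        # No parentheses at all: system memory is assumed.
        return media_type, [SYSTEM_MEMORY]
    features = [f.strip() for f in raw_features.split(",") if f.strip()]
    return media_type, features


if __name__ == "__main__":
    for caps in ("video/x-raw, format=NV12",
                 "video/x-raw(ANY)",
                 "video/x-raw(memory:GLMemory), format=RGBA"):
        print(caps, "->", split_caps_features(caps))
```

In a real application you would let GStreamer parse the caps and query the features on the resulting caps object rather than parse strings by hand.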

In advanced pipelines, e.g. with hardware acceleration in the mix, you might instead get raw video data backed by special types of memory. That is specified via caps features, so that only elements that can handle that special type of memory will receive such data.

The video/x-raw(ANY) here means that the element doesn’t care about the underlying memory, most likely because it doesn’t look at the data itself, but just passes buffers through and perhaps looks at timestamps or similar.
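As a rough mental model of how ANY behaves when two sets of caps features are compared, consider the Python sketch below. It is a simplification written for this answer, not GStreamer's implementation: the function name `features_compatible` is hypothetical, and real caps negotiation also has to intersect the media type and fields such as format, which this sketch ignores.

```python
# Simplified model (an assumption for illustration): a feature set is a list
# of feature names, and the special name "ANY" matches every feature set.
ANY = "ANY"
SYSTEM_MEMORY = "memory:SystemMemory"


def features_compatible(features_a, features_b):
    """Return True if two caps-feature sets match under the simplified model.

    Only the features are compared; media type and fields such as
    'format' are deliberately left out.
    """
    if ANY in features_a or ANY in features_b:
        # ANY accepts whatever memory type the other side uses.
        return True
    # An empty feature list is treated as plain system memory.
    a = set(features_a) or {SYSTEM_MEMORY}
    b = set(features_b) or {SYSTEM_MEMORY}
    return a == b


if __name__ == "__main__":
    print(features_compatible([ANY], ["memory:GLMemory"]))
    print(features_compatible([], [SYSTEM_MEMORY]))
    print(features_compatible([SYSTEM_MEMORY], ["memory:GLMemory"]))
```

The point of the model is only that an element advertising ANY does not restrict the memory type, whereas two specific feature sets have to agree.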

The videorate element is an example of this.