When I first searched for the discard instruction, I found experts saying that using discard results in a performance drain. They said discarding pixels breaks the GPU's ability to use the z-buffer properly, because the GPU has to run the fragment shader for both objects first to check whether the one nearer to the camera is discarded or not. For the 2D game I'm currently working on, I've disabled both depth-test and depth-write. I draw all objects sorted by their depth and that's all; no need for the GPU to do anything fancy. Now I'm wondering: is it still bad if I discard pixels in my fragment shader?
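For reference, the kind of discard I mean is a plain alpha cutout in the fragment shader. This is just an illustrative sketch; the uniform and varying names are my own, not from my actual project:

```glsl
#version 330 core

in vec2 vTexCoord;
out vec4 fragColor;

uniform sampler2D uSprite; // hypothetical sprite texture

void main()
{
    vec4 color = texture(uSprite, vTexCoord);
    // Skip fully transparent texels instead of writing them.
    if (color.a < 0.5)
        discard;
    fragColor = color;
}
```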
As always for performance questions the most accurate answer is to try it out on your target hardware and measure what happens.
In your case it's probably not a bad thing to do. In fact, there's a chance it will help performance by saving memory bandwidth. It will also add shader instructions, though, so it isn't always a performance win.
Even when using the depth buffer, the performance hit isn't always significant if you're careful about the order you draw things in.
There's a blog post at https://fgiesen.wordpress.com/2011/07/08/a-trip-through-the-graphics-pipeline-2011-part-7/ which describes in some detail how early depth testing might work in hardware, and what limitations there might be.
Graphics hardware can perform early depth-based culling of fragments before computing their color value (in other words, before running your fragment shader). Consequently, if you use any feature that would affect that, such as discard, alpha testing, or writing to gl_FragDepth, the hardware's ability to perform that optimization is compromised, since the true depth of the fragment can no longer be assumed and the full shader must be run.
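To sketch the gl_FragDepth case: merely writing to it forces the full shader to run before any depth test can reject the fragment, because the fragment's final depth isn't known until the shader finishes. The shader below is illustrative only, not taken from any real codebase:

```glsl
#version 330 core

out vec4 fragColor;

void main()
{
    fragColor = vec4(1.0);
    // Writing gl_FragDepth means the rasterizer's interpolated depth
    // is no longer the fragment's true depth, so early-z cannot
    // reject this fragment before the shader has run.
    gl_FragDepth = gl_FragCoord.z + 0.001;
}
```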
Whether or not the use of any of those compromising features has a net observable performance impact depends on the situation, though. The early-z optimization can improve performance if you have very expensive fragment shaders, for example, but if the cost of your pipeline is in the vertex shader (or elsewhere) it won't benefit you as much, and consequently you may see little or no performance degradation from using discard.
Disabling the depth test entirely via the API should prevent the optimization from running as well, since running it could result in incorrectly rendered scenes. In your case, then, it shouldn't matter that you use discard.
Recent hardware can force the early tests (including early stencil tests) using layout(early_fragment_tests) in the fragment shader; there is more information (and caveats) on this on the page I linked at the beginning of the answer.
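As a minimal sketch of that qualifier (assuming a GL 4.2+ context; the texture name is hypothetical):

```glsl
#version 420 core // layout(early_fragment_tests) requires OpenGL 4.2+

layout(early_fragment_tests) in;

in vec2 vTexCoord;
out vec4 fragColor;

uniform sampler2D uSprite; // hypothetical sprite texture

void main()
{
    vec4 color = texture(uSprite, vTexCoord);
    // With early fragment tests forced on, the depth/stencil tests run
    // BEFORE this shader. One caveat: discard then no longer prevents
    // the depth write, since the test (and write) already happened.
    if (color.a < 0.5)
        discard;
    fragColor = color;
}
```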