It is true that, with verbs of perception, both the infinitive and the V-ing form are possible.
In some cases, the infinitive refers to the whole action, from beginning to end, while the V-ing form refers to a moment while the action is in progress.
Why do we use a bare infinitive in some cases:
- "Do you like seeing people compete on reality shows? If so, which ones?"
And a gerund in others:
- "On reality shows, you often see people behaving badly. Do you think shows like that are a bad influence on society?"
In these particular cases, it might be argued that your first sentence means that the whole competition is seen:
- Do you like seeing people as they compete on reality shows (from beginning to end)?
while your second sentence means that a moment of bad behavior is witnessed:
- On reality shows, you often see people in the act of behaving badly.
This difference might be clearer in other cases (these examples are the ones I used to give my students long ago):
- I saw him cross the street (from side to side)
- I saw him crossing the street (while he was crossing)
- I heard her sing a song (the complete song)
- I heard her singing a song (part of the song)