There are two striking features of language that any scientific theory of this quintessentially human behavior must account for. The first is that we do not all speak the same language. This would be a shocking observation were it not so commonplace. Communication systems in other animals tend to be universal, with any animal of the species able to communicate with any other. Likewise, many other fundamental human attributes show much less variation. Barring genetic or environmental mishap, we all have two eyes, one mouth, and four limbs. Around the world, we cry when we are sad, smile when we are happy, and laugh when something is funny, but the languages we use to describe this are different.
The second striking feature of language is that when you consider the space of possible languages, most languages are clustered in a few tiny bands. That is, most languages are much, much more similar to one another than random variation would have predicted.
Starting with pioneering work by Joseph Greenberg, scholars have cataloged over two thousand linguistic universals (facts true of all languages) and biases (facts true of most languages). For instance, in languages with fixed word order, the subject almost always comes before the object. If the verb describes a caused event, the entity that caused the event is the subject ("John broke the vase") not the object (for example, "The vase shbroke John" meaning "John broke the vase"). In languages like English where the verb agrees with one of its subjects or objects, it typically agrees with the subject (compare "the child eats the carrots" with "the children eat the carrots") and not with its object (this would look like "the child eats the carrot" vs. "the child eat the carrots"), though in some languages, like Hungarian, the ending of the verb changes to match both the subject and object.
When I point this out to my students, I usually get blank stares. How else could language work? The answer is: very differently. Scientists and engineers have created hundreds of artificial languages to do the work of mathematics (often called "the universal language"), logic, and computer programming. These languages show none of the features mentioned above for the simplest of reasons: the researchers who invented these languages never bothered to include verb agreement or even the subject/object distinction itself.
Since we became aware of just how tightly constrained the variation in human language is, researchers have struggled to find an explanation. Perhaps the most famous account is Chomsky's Universal Grammar hypothesis, which argues that humans are born with innate knowledge about many of the features of language (e.g., languages distinguish subjects and objects), which would not only explain cross-linguistic universals but also perhaps how language learning gets off the ground in the first place. Over the years, Universal Grammar has become increasingly controversial for a number of reasons, one of which is the arbitrariness of the theory: The theory merely replaces the question of why we have the languages we have, and not others, with the question of why we have the Universal Grammar we have, and not another one.
As an alternative, a number of researchers have explored the possibility that some universals in language fall out of necessary design constraints. The basic idea is that some possible but nonexistent languages do not exist because they would simply be bad languages. There there are are no no languages languages in in which which you you repeat repeat every every word word. We don't need Universal Grammar to explain this; sheer laziness will suffice. Similarly, there are no languages that consist of a single, highly ambiguous word (sorry Hodor); such a language would be nearly useless for communication.