The Significance of Smart Questions

25 Jan 2024

What’s the Big Deal?

Smart questions play such an important role for software engineers, guiding us towards innovative solutions alongside a deeper understanding of complex problems. People are more willing to help others with questions who demonstrate the willingness to learn and the willingness to put in the effort. The people answering our questions are mostly volunteers who take the time out of their lives to provide us with answers to our questions for free. With so many questions out there, it’s important to make sure that your question is worth answering. How a question is asked is often more important than how hard or difficult it is to answer a question!

The Question:

Q: Why is processing a sorted array faster than processing an unsorted array?

In this C++ code, sorting the data (before the timed region) makes the primary loop ~6x faster:

(Sorce Code)

Without std::sort(data, data + arraySize);, the code runs in 11.54 seconds.
With the sorted data, the code runs in 1.93 seconds.
(Sorting itself takes more time than this one pass over the array, so it's not actually worth doing if we needed to calculate this for an unknown array.)

Initially, I thought this might be just a language or compiler anomaly, so I tried Java:

(Additional Sorce Code)

With a similar but less extreme result.

My first thought was that sorting brings the data into the cache, but that's silly because the array was just generated.

What is going on?
Why is processing a sorted array faster than processing an unsorted array?
The code is summing up some independent terms, so the order should not matter.

This selected question exhibits several characteristics of a smart question. Firstly, the problem presented is specific. People who see it will immediately understand what exactly it is the author is requesting help with. Secondly, the author demonstrates their commitment to understanding the issue. They conducted experiments on their own, making sure to include their test code alongside the execution times for both a sorted array and an unsorted array. Additionally, the author shared initial hypotheses (compiler/language anomaly + data sorting relating to the cache) and further experimentations to test those hypotheses, avoiding assumptions and ruling out potential explanations.

The Answer

All the responses I read into were of high quality and great depth, a reflection of the quality of the question. While most of the responses are a bit too lengthy to include completely in this essay, they can all be viewed on stackoverflow where the question was asked here: Link to stackoverflow question

The first response starts off with the answer to the question with another question: “What is Branch Prediction?” The reason for the time discrepancy is due to branch prediction, but the responder didn’t stop there. They opened up with an analogy of branch prediction being like a railroad junction and you are the operator of the junction. It is explained as: “You stop the train to ask the driver which direction they want. And then you set the switch appropriately.” The problem with this is that trains take forever to start up and slow down, similar to modern processors! What branch prediction does is that it guesses which direction the branch will go through identifying a pattern and following it. This is why a sorted array (which is much more predictable) gets processed much faster than an unsorted array (which is much less if not unpredictable) rendering branch predictors essentially useless. The response additionally includes a quick visualization of branch prediction and goes into potential fixes when the compiler isn’t able to optimize branch prediction and provides the time for multiple different scenarios.

An additional response to this question mentions how this first response was ‘explained beautifully’.

Conclusion

I feel that the willingness to go to such lengths to not only answer the question, but also to provide additional explanations and information for the author’s understanding and alternative solution use were done so out of respect and acknowledgement of the author’s commitment to understanding/figuring out the question. The lengths one would go to in order to answer a question would be drastically lower should the question have been asked poorly, or the author did not demonstrate a willingness to learn. Their genuine curiosity and known efforts at trying to figuring this question out has had an impact on the quality of the responses they received.

Asking quality, smart questions results in qualtiy smart answers.