Uber is a company that changed inner city travel and made the lives of millions of users much easier. Their main product is a smartphone app that lets users find a ride for a reasonable price. Uber is involved in a number of other businesses, such as food delivery, shipments, and more.
All of these operations have one thing in common: they rely on stable and efficient logistics to successfully complete the trip. To reach that goal, Uber collects massive amounts of data every day. The company maintains a team of talented data scientists to gain actionable insights from the available information.
In this article, we will explore one of the SQL interview questions asked during an interview at Uber.
Basic to Intermediate Concepts Tested in Uber SQL Interview
Before going into an interview, read this “Ultimate Guide to SQL Aggregate Functions” to refresh your knowledge of aggregate functions. You will need to know aggregate functions to solve most SQL questions asked during interviews, and more importantly, write SQL queries on your day to day job.
You need to have a basic knowledge of aggregate functions, such as the syntax, arguments and how to summarize the results using the GROUP BY statement.
If you want to really stand out from the competition, it’s a good idea to dig a little deeper. For instance, learn how to use aggregate functions in combination with other SQL features, such as the DISTINCT statement.
SUM() is one of the most commonly used aggregate functions. It is necessary to solve questions where you have to add up numerical values in a certain column. In the question below, we will use it to calculate total miles.
You should know that SUM() can only be used for counting the total of numbers, not with text, date or other types of values. Knowing little details, like how this function handles NULL values, can go a long way.
The WHERE statement is one of the basic features of SQL. Despite its simplicity, it is essential for filtering data in SQL. You should know how to use it with SELECT/FROM statements to filter out rows that don’t meet the criteria.
To filter the tables in SQL, you should know how to set a condition for the WHERE statement. There is no limitation on what the condition can be. You might have to check if the number value in one column is higher than another. You should be prepared for how to use comparison operators to compare other values, such as strings, dates and floats.
There are syntax rules for writing conditions. For instance, whether it’s necessary to use quotes when comparing number, string, date and other types of values.
It’s important to know how comparison operators work on non-numerical data. For instance, if you’re working with sales data, you might want to limit the output to one specific month. You can use comparison operators to get the records from one specific month, or determine which of the two dates came first.
Interviewers often ask you to return X number of highest values in a column. To do that, you need to arrange the data from the highest to lowest, and output the specified number of values in the order.
We use the ORDER BY statement to sort rows in SQL. It allows you to specify a column, and SQL will order the records based on values in that column. To solve some of the more difficult questions, you might have to specify more than one column. It’s useful to know how to specify the column by name or by its index in the table.
A good candidate should know how to specify between ascending and descending order for sorting values, and how the ordering works on date and string values.
Also, the knowledge of details, such as what happens when the order is not specified, can go a long way.
Ranking Window Functions
As previously mentioned, to solve some questions, candidates have to output a specific number of highest values in a column. There are multiple ways to approach these questions.
In some cases, like for the question below, it can be argued that a simple LIMIT statement is enough to output a specific number of values off the top.
On the other hand, ranking window functions are always the most accurate and foolproof method to output a certain number of highest values.
Functions like RANK() and DENSE_RANK() allow you to assign ranking to rows based on the value in one of the columns. It’s important to understand the difference between these two functions.