SQL Error – "SELECT TOP 100" throws error in Spark SQL – what's the correct syntax?
In SQL Server, you get the top n rows from a table or result set with the SELECT TOP clause, specifying the number of rows to return, as in the query below.
But when I tried the same query in Spark SQL, I got a syntax error, which meant that the TOP clause is not supported in the SELECT statement.
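For reference, this is roughly what the SQL Server query looked like (assuming the same SalesOrder table used later in this post):

```sql
-- SQL Server: return the first 100 rows of the table
SELECT TOP 100 * FROM SalesOrder
```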
%sql Select TOP 100 * from SalesOrder
Error in SQL statement: ParseException:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input '100' expecting (line 1, pos 11)

== SQL ==
Select top 100 * from SalesOrder
-----------^^^
Since Spark SQL does not support the TOP clause, I tried the MySQL-style syntax instead, which uses the LIMIT clause.
So I removed TOP 100 from the SELECT query and added LIMIT 100 at the end; it worked and returned the expected results!
%sql Select * from SalesOrder LIMIT 100
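One caveat worth noting: like TOP, LIMIT without an ORDER BY returns an arbitrary subset of rows, since neither clause guarantees row order on its own. If you need a deterministic top 100, add an ORDER BY first; the OrderDate column below is a hypothetical example, not necessarily part of the actual SalesOrder schema:

```sql
-- Spark SQL: deterministic "top 100" – the 100 most recent orders
-- (OrderDate is a hypothetical column, shown for illustration)
SELECT *
FROM SalesOrder
ORDER BY OrderDate DESC
LIMIT 100
```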