Hive Query Execution
Jan 21, 2023
The steps involved in executing HQL statements:
- Execute Query : An interface such as the Web UI or the Hive CLI communicates with the Hive driver to execute the HQL statements.
- Get Plan : The driver accepts the query, creates a session, and passes the query to the Hive compiler to obtain an execution plan.
- Get Metadata : The Hive compiler sends a metadata request to the Hive Metastore.
- Send Metadata : The metastore sends the metadata to the compiler. The compiler uses this metadata to perform type-checking and semantic analysis on the expressions in the query tree. The compiler then generates the execution plan, a directed acyclic graph (DAG) of stages.
- Send Plan : The compiler then sends the generated execution plan to the driver.
- Execute Plan : After receiving the execution plan from the compiler, the driver sends it to the execution engine to be executed.
- Submit Job : The execution engine then sends the stages of the DAG to the appropriate components. For each task, either mapper or reducer, the deserializer associated with the table or intermediate output is used to read the rows from HDFS files.
Once the output is generated, it is written to a temporary HDFS file through the serializer. These temporary HDFS files then provide the data for the subsequent MapReduce stages of the plan. For DML operations, the final temporary file is moved to the table’s location.
- (8) and (9) Send Result : For queries, the execution engine reads the contents of the temporary files directly from HDFS as part of a fetch call from the driver. The driver then sends the results to the Hive interface.
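To make the flow above more concrete, here is a minimal Python sketch that imitates the same hand-offs: driver to compiler, compiler to metastore, plan back to the driver, then stage-by-stage execution through temporary files. It is only a toy model. Every class, method, and file name in it (`Metastore`, `Compiler`, `ExecutionEngine`, `Driver`, `run_demo`, the `stage-N.tmp` files) is made up for illustration and is not Hive's real API; a local temporary directory stands in for HDFS, and JSON lines stand in for the serializer and deserializer.

```python
# Toy model of the Hive query flow. All names are hypothetical;
# none of this is Hive's actual API.
import json
import os
import tempfile


class Metastore:
    """Holds table schemas (Get Metadata / Send Metadata)."""

    def __init__(self, schemas):
        self.schemas = schemas

    def get_metadata(self, table):
        return self.schemas[table]


class Compiler:
    """Checks the query against the metadata and emits a plan."""

    def __init__(self, metastore):
        self.metastore = metastore

    def compile(self, table, column):
        schema = self.metastore.get_metadata(table)
        if column not in schema:  # stand-in for semantic analysis
            raise ValueError(f"unknown column {column!r} in table {table!r}")
        # A two-stage linear DAG: project the column, then sum it.
        return [("project", column), ("sum", column)]


class ExecutionEngine:
    """Runs stages in order, passing rows between them via temp files."""

    def __init__(self, workdir):
        self.workdir = workdir  # local directory standing in for HDFS

    def _write(self, rows, stage_no):
        path = os.path.join(self.workdir, f"stage-{stage_no}.tmp")
        with open(path, "w") as f:  # "serializer": rows -> JSON lines
            for row in rows:
                f.write(json.dumps(row) + "\n")
        return path

    def _read(self, path):
        with open(path) as f:  # "deserializer": JSON lines -> rows
            return [json.loads(line) for line in f]

    def execute(self, plan, input_rows):
        path = self._write(input_rows, 0)
        for stage_no, (op, column) in enumerate(plan, start=1):
            rows = self._read(path)  # read the previous stage's output
            if op == "project":
                rows = [{column: r[column]} for r in rows]
            elif op == "sum":
                rows = [{"sum": sum(r[column] for r in rows)}]
            path = self._write(rows, stage_no)
        return path  # path of the final temporary file

    def fetch(self, path):
        return self._read(path)


class Driver:
    """Coordinates the compiler and the execution engine."""

    def __init__(self, compiler, engine):
        self.compiler = compiler
        self.engine = engine

    def run(self, table, column, input_rows):
        plan = self.compiler.compile(table, column)          # Get Plan / Send Plan
        result_path = self.engine.execute(plan, input_rows)  # Execute Plan / Submit Job
        return self.engine.fetch(result_path)                # Send Result


def run_demo():
    with tempfile.TemporaryDirectory() as workdir:
        metastore = Metastore({"sales": {"id": "int", "amount": "int"}})
        driver = Driver(Compiler(metastore), ExecutionEngine(workdir))
        rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 32}]
        return driver.run("sales", "amount", rows)


if __name__ == "__main__":
    print(run_demo())  # [{'sum': 42}]
```

The point of the sketch is the shape of the pipeline: each stage reads the previous stage's temporary file, writes its own, and the driver only fetches the final file at the end. Real Hive plans can be branching DAGs rather than a straight list, and the stages run as distributed jobs rather than in-process loops.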
Thank you for reading :).
Connect with me on LinkedIn : Naveen Pn