Azure Synapse Analytics
Go to portal.azure.com and log in with your credentials.
Ensure you have the necessary permissions to create and manage resources in Azure Synapse Analytics and Data Lake Storage.
Search for "Azure Synapse Analytics" in the Azure portal.
From the Azure dashboard, use the search bar to find "Azure Synapse Analytics" and select your Synapse workspace.
Click "Open Synapse Studio".
In the Synapse workspace, click on the "Open Synapse Studio" button to launch the integrated development environment (IDE) where you’ll manage your data lake and external tables.
Create an external table that references your data in Data Lake Storage.
Use the following SQL code in a new script in the "Develop" tab to create an external table:
CREATE DATABASE final_db; use final_db; CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat] WITH ( FORMAT_TYPE = DELIMITEDTEXT , FORMAT_OPTIONS ( FIELD_TERMINATOR = ',', FIRST_ROW = 2, USE_TYPE_DEFAULT = FALSE )) CREATE EXTERNAL DATA SOURCE [mydatasource] WITH ( LOCATION = 'https://datalakestorageazure083.dfs.core.windows.net/data/' ) CREATE EXTERNAL TABLE tech_support ( [PROBLEM_TYPE] nvarchar(4000), [no_of_cases] int, [Avg_pending_calls] float, [Avg_resol_time] float, [recurrence_freq] float, [Replace_percent] float, [In_warranty_percent] float, [Post_warranty_percent] float ) WITH ( LOCATION = '/technical_support_data.csv', DATA_SOURCE = [mydatasource], FILE_FORMAT = [SynapseDelimitedTextFormat] )
Ensure you replace the placeholders with your schema, table name, data source, and file format details. The data source is the linked service you created in the previous step.
Query the external table to verify data integration.
Once the external table is created, you can query it as if it were a regular table. Use a simple `SELECT` statement to retrieve data from the external table and validate that it's correctly connected to your data lake.
SELECT TOP 100 * FROM tech_support;
By following this step-by-step guide, you have successfully created an external table in Azure Synapse Analytics using data from Azure Data Lake Storage. This setup allows you to seamlessly query large datasets stored in your data lake for analytics and reporting purposes.