Azure Synapse Analytics

Step-by-Step Guide to Creating External Tables in Azure Synapse - Serverless SQL Pool

Azure Synapse Analytics allows you to create external tables using data stored in Azure Data Lake Storage. This guide will walk you through the steps to set up an external table, enabling you to query data directly from your data lake.

Step 1: Log in to the Azure Portal

Go to portal.azure.com and log in with your credentials.

Ensure you have the necessary permissions to create and manage resources in Azure Synapse Analytics and Data Lake Storage.

Step 2: Navigate to Azure Synapse Analytics

Search for "Azure Synapse Analytics" in the Azure portal.

From the Azure dashboard, use the search bar to find "Azure Synapse Analytics" and select your Synapse workspace.

Step 3: Open Synapse Studio

Click "Open Synapse Studio".

In the Synapse workspace, click on the "Open Synapse Studio" button to launch the integrated development environment (IDE) where you’ll manage your data lake and external tables.

Step 5: Create an External Table

Create an external table that references your data in Data Lake Storage.

Use the following SQL code in a new script in the "Develop" tab to create an external table:

   CREATE DATABASE final_db;
    use final_db;
    CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat] 
        
        WITH ( 
            FORMAT_TYPE = DELIMITEDTEXT ,
               FORMAT_OPTIONS (
                 FIELD_TERMINATOR = ',',
                 FIRST_ROW = 2,
                 USE_TYPE_DEFAULT = FALSE
    ))
    
    CREATE EXTERNAL DATA SOURCE [mydatasource] 
         WITH (
            LOCATION = 'https://datalakestorageazure083.dfs.core.windows.net/data/' 
    )
   
    CREATE EXTERNAL TABLE tech_support (
        [PROBLEM_TYPE] nvarchar(4000),
        [no_of_cases] int,
        [Avg_pending_calls] float,
        [Avg_resol_time] float,
        [recurrence_freq] float,
        [Replace_percent] float,
        [In_warranty_percent] float,
        [Post_warranty_percent] float
        )
        WITH (
        LOCATION = '/technical_support_data.csv',
        DATA_SOURCE = [mydatasource],
        FILE_FORMAT = [SynapseDelimitedTextFormat]
    )
  
          

Ensure you replace the placeholders with your schema, table name, data source, and file format details. The data source is the linked service you created in the previous step.

Step 6: Query the External Table

Query the external table to verify data integration.

Once the external table is created, you can query it as if it were a regular table. Use a simple `SELECT` statement to retrieve data from the external table and validate that it's correctly connected to your data lake.

  SELECT TOP 100 * FROM tech_support;

Conclusion

By following this step-by-step guide, you have successfully created an external table in Azure Synapse Analytics using data from Azure Data Lake Storage. This setup allows you to seamlessly query large datasets stored in your data lake for analytics and reporting purposes.