Business intelligence (BI) is the set of processes and technologies that enable you to transform your business data into the management information essential for informed decision-making. It represents an enormous competitive advantage, especially in times of recession. And our customers understand this!
As part of our BI team, we’ve had the chance to work on large-scale projects for a number of different customers, using a variety of tools to exploit the full potential of a company’s data, whatever its area of expertise. In fact, we’ve used Azure Synapse on several occasions. After several years of use across successive releases, we’d like to share our point of view and give you a genuine opinion on this famous Cloud platform developed by Microsoft.
What is Synapse?
First of all, let’s take a brief look at Synapse. It’s a data analysis tool that combines various resources, in particular storage and computing resources. Basically, it allows us to exploit our customers’ data, consolidate it, analyse it and visualise it, all within an agile and scalable environment.
The wide variety of resources and the scalability of the tool thanks to its cloud hosting mean that it can appeal to a wide range of audiences and budgets. So we’ve explored its features in great detail to give you some useful and comprehensive feedback.
The Synapse Features that Won Over Our BI Team
Azure Synapse offers a number of features that we really appreciated:
1. Parametrisation
Almost every element in Synapse (linked services, integration runtimes, pipelines, data flows, triggers, etc.) can be parameterised using Dynamic Content. This offers a great deal of development flexibility, particularly when it comes to carrying out repetitive tasks. For example, it’s common for our projects to use JSON parameter files. Instead of maintaining twenty similar pipelines that ingest the same data source, we keep a single pipeline that adapts its behaviour to the parameters passed to it.
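To make the idea concrete, here is a minimal sketch in plain Python of how a single generic routine can be driven by a JSON parameter file. It does not use any Synapse API: the file layout, the field names (`source`, `table`, `load`, `watermark`) and the `raw/...` sink path are assumptions chosen for illustration only.

```python
import json

# Hypothetical parameter file: one entry per source table to ingest.
PARAMETER_FILE = """
[
  {"source": "crm", "table": "customers", "load": "full"},
  {"source": "crm", "table": "orders", "load": "incremental", "watermark": "updated_at"}
]
"""


def build_query(params: dict) -> str:
    """Build the extraction query for one parameter entry."""
    query = f"SELECT * FROM {params['table']}"
    if params["load"] == "incremental":
        # Only incremental loads filter on a watermark column.
        query += f" WHERE {params['watermark']} > :last_run"
    return query


def plan_runs(parameter_json: str) -> list[dict]:
    """Expand the parameter file into one run description per entry."""
    runs = []
    for params in json.loads(parameter_json):
        runs.append({
            "sink_path": f"raw/{params['source']}/{params['table']}",
            "query": build_query(params),
        })
    return runs


for run in plan_runs(PARAMETER_FILE):
    print(run["sink_path"], "<-", run["query"])
```

Adding a new table then means adding one entry to the parameter file rather than cloning a pipeline.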
2. Data Historisation
For those who want to avoid traditional data historisation methods such as SCD (Slowly Changing Dimensions), which rely on SQL databases that can generate considerable costs, Synapse also supports the Delta format, an open-source technology developed by Databricks and built on Apache Parquet files. Among other things, its ‘time travel’ capability lets you historise files in the data lake and read a table as it was at an earlier version.
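The principle behind time travel can be sketched with a toy model in plain Python: the table is an ordered log of commits, each recording which data files were added or removed, and reading an older version simply replays the log up to that point. This is an illustration of the idea only, not Delta's actual implementation, and the file names are made up. In Spark, to our knowledge, the equivalent is requested with the Delta reader option `versionAsOf`.

```python
# Toy model of a versioned table: each commit records files added and removed.
# It illustrates the idea only and is not Delta Lake's actual implementation.
commit_log = [
    {"add": ["part-000.parquet"], "remove": []},                    # version 0
    {"add": ["part-001.parquet"], "remove": []},                    # version 1
    {"add": ["part-002.parquet"], "remove": ["part-000.parquet"]},  # version 2
]


def files_as_of(log: list[dict], version: int) -> set[str]:
    """Replay commits 0..version to find the files visible at that version."""
    if not 0 <= version < len(log):
        raise ValueError(f"unknown version {version}")
    visible: set[str] = set()
    for commit in log[: version + 1]:
        visible |= set(commit["add"])
        visible -= set(commit["remove"])
    return visible


print(sorted(files_as_of(commit_log, 1)))  # state before the rewrite
print(sorted(files_as_of(commit_log, 2)))  # latest state
```

Because old files are only marked as removed rather than overwritten, earlier versions stay readable without a separate history table.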
3. Volume of Data
Azure Synapse is capable of managing massive volumes of both structured and unstructured data. Thanks to the Cloud, it can scale storage and compute elastically and independently of each other. Several options exist for data storage, including Azure Data Lake Storage Gen2, Microsoft’s data lake platform, which should be considered for any project exceeding one terabyte of data. For such volumes, Synapse can rely on PolyBase to read and load external data in parallel, which in our experience performs very well.
4. Integration with Azure Services
The Synapse platform integrates seamlessly with other Azure services. What’s more, it integrates Spark (a nod to our partner Databricks). However, beware of billing surprises, as using Spark on Synapse seems to be much more expensive than on Databricks!
5. Parallelisation
The databases in Azure Synapse are based on massively parallel processing (MPP) technology, which enables them to handle analytical workloads and to aggregate and process large volumes of data efficiently. An MPP engine distributes data across multiple nodes, and each node processes its own part of a query in parallel with the others. This architecture is well suited to complex, long-running analytical processes. What’s more, the addition of Spark opens up just as many possibilities for processing massive data, particularly in the field of artificial intelligence.
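The distribution principle can be illustrated with a small Python sketch: rows are assigned to nodes by hashing a distribution key, each node aggregates its own rows, and the partial results are combined. The node count, the sample data and the function names are assumptions for the example, and the "nodes" here run one after another in a single process, whereas a real MPP engine runs them in parallel.

```python
import zlib
from collections import defaultdict

NODE_COUNT = 4  # hypothetical; a real pool decides its own layout


def node_for(key: str) -> int:
    """Assign a row to a node with a stable hash of its distribution key."""
    return zlib.crc32(key.encode("utf-8")) % NODE_COUNT


def distribute(rows: list[dict], key: str) -> dict[int, list[dict]]:
    """Split rows across nodes by hashing the distribution column."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key])].append(row)
    return nodes


def parallel_sum(nodes: dict[int, list[dict]], column: str) -> float:
    """Each node sums its own rows; the partial sums are then combined."""
    partials = [sum(r[column] for r in part) for part in nodes.values()]
    return sum(partials)


sales = [
    {"customer": "A", "amount": 10.0},
    {"customer": "B", "amount": 5.0},
    {"customer": "A", "amount": 2.5},
    {"customer": "C", "amount": 7.5},
]
nodes = distribute(sales, "customer")
print(parallel_sum(nodes, "amount"))
```

Since all rows sharing a key land on the same node, aggregations grouped by that key need no data movement between nodes, which is why the choice of distribution column matters.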
6. Connectors
A multitude of connectors are available, allowing you to natively integrate data from the most popular data sources, including SQL, MySQL, SAP, Google Sheets, HubSpot, REST and Shopify, to name but a few. You can always search for other connectors in a catalogue shared between users, or even create your own if one doesn’t exist.
Three Suggestions for Improving Azure Synapse
Despite the undeniable strengths of the Synapse tool, there are other aspects that we like less and which we would like to see improved in future versions for a smoother experience:
1. Environment Variables
Synapse doesn’t offer environment variable management, which poses difficulties in terms of version integration and project reuse. We have to manage these values “by hand”, using a system of versioned JSON files, which makes our pipelines more complex.
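As an illustration of this kind of manual handling, the sketch below resolves settings for a given environment from a JSON document, with environment-specific values overriding shared defaults. The structure, the environment names and the values (`stdatadev`, `pool_prod`, and so on) are hypothetical and are not taken from an actual project.

```python
import json

# Hypothetical settings file, kept under version control with the pipelines.
SETTINGS = json.loads("""
{
  "default": {"storage_account": "stdatadev", "sql_pool": "pool_dev", "retries": 3},
  "test":    {"storage_account": "stdatatest", "sql_pool": "pool_test"},
  "prod":    {"storage_account": "stdataprod", "sql_pool": "pool_prod", "retries": 5}
}
""")


def settings_for(environment: str) -> dict:
    """Return the default settings overridden by the environment's values."""
    if environment not in SETTINGS:
        raise KeyError(f"no settings for environment {environment!r}")
    resolved = dict(SETTINGS["default"])
    resolved.update(SETTINGS[environment])
    return resolved


print(settings_for("test"))
print(settings_for("prod"))
```

The resolved values can then be passed to pipelines as parameters, at the cost of one more lookup step in every pipeline that needs them.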
2. Complexity of CI/CD
Continuous integration and continuous delivery (CI/CD) can quickly become complex, not least because the default deployment task available in Azure DevOps is limited. What’s more, certain elements such as Spark pools and integration runtimes require particular attention when moving from one environment to another.
3. Sharing Integration Runtimes
Unlike Data Factory, Synapse does not support the sharing of integration runtimes, which poses problems at various levels, particularly with regard to CI/CD.
Conclusion
We are unanimously satisfied with the use of Synapse and we hope that Microsoft will continue to improve it. We can’t wait to see what’s next!
Alongside our experience with Azure Synapse, we would also like to highlight the growing interest in Microsoft Fabric. This new Microsoft solution is a unified analytics platform that brings data integration, data engineering, data warehousing and reporting together in a single offering. Microsoft also plans to offer migration paths enabling projects to be transitioned directly from Synapse to Fabric in the near future. As we explore the possibilities offered by Microsoft Fabric, we are convinced that this tool can complement and strengthen the existing ecosystem, including integration with Azure Synapse.
By combining Synapse’s capabilities with those of Microsoft Fabric, BI teams could benefit from a more complete and adaptable solution for their data processing and analysis needs. We’ll be keeping a close eye on developments with Microsoft Fabric and look forward to using it for future projects! Stay tuned, we’ll be telling you all about it very soon. 😇