With Azure Data Factory (ADF), Microsoft moved the features of a classic data warehouse to the cloud. The newer service Azure Synapse Analytics (Azure Synapse) goes a step further and combines data warehousing with big data and data science. An example using data from an ordinary fitness tracker, the Xiaomi Mi Fit INTELLI-Band 4, shows how easily the cloud service can be used for code-free data integration and analysis.
Use cases from practice
Azure Synapse is built on essentially the same data integration engine and offers the same work environments as ADF, but lacks features around the integration runtime and Power Query. Spark pools and data flow monitoring, on the other hand, are only available in Azure Synapse. The following practical example uses an Azure Synapse batch ETL pipeline together with the serverless SQL pool.
The data to be analyzed comes from the Xiaomi fitness tracker and covers daily activity (number of steps and distance traveled per day) and sleep activity (deep sleep and light sleep). The Mi Fit app provides its own statistics, and suitable data sets for analysis can be found in publicly available sources on the internet. As an experimental data engineer, however, the author prefers to work with his own data.
The data is uploaded as two CSV files to Azure Blob Storage, where the Azure Synapse pipeline extracts, transforms and merges them into a single CSV file. The merged data is then analyzed in the serverless Azure Synapse SQL pool (Figure 1).
The amount of data is manageable, which suits the serverless Azure Synapse SQL pool because it is billed only for actual use. The details of a practical example at the end of the article show what costs may actually arise.
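To make the merge step concrete, the following is a minimal local sketch in Python of what the pipeline does conceptually. It is not how Azure Synapse implements the step, and the pipeline in this article is built without code. The column names (`date`, `steps`, `distance`, `deepSleepTime`, `shallowSleepTime`), the inner join on the date and the sample values are assumptions for illustration only. The columns in the real export may differ.

```python
import csv
import io


def merge_daily(activity_csv, sleep_csv, key="date"):
    """Inner-join activity and sleep rows on a shared date column.

    Both arguments are open text files (or file-like objects).
    The column name in `key` is an assumption, not taken from the export.
    """
    # Index the sleep rows by date so each activity row can be matched quickly.
    sleep_by_date = {row[key]: row for row in csv.DictReader(sleep_csv)}

    merged = []
    for row in csv.DictReader(activity_csv):
        sleep = sleep_by_date.get(row[key])
        if sleep is None:
            continue  # no sleep record for this day: skip it (inner join)
        merged.append({**row, **sleep})
    return merged


def write_csv(rows, out):
    """Write the merged rows as a single CSV file with a header line."""
    if not rows:
        return
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data standing in for ACTIVITY.csv and SLEEP.csv.
SAMPLE_ACTIVITY = "date,steps,distance\n2021-03-01,8000,5600\n2021-03-02,4000,2800\n"
SAMPLE_SLEEP = "date,deepSleepTime,shallowSleepTime\n2021-03-01,90,300\n"

if __name__ == "__main__":
    rows = merge_daily(io.StringIO(SAMPLE_ACTIVITY), io.StringIO(SAMPLE_SLEEP))
    out = io.StringIO()
    write_csv(rows, out)
    print(out.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by `open("ACTIVITY.csv", newline="")` and `open("SLEEP.csv", newline="")`. Days without a matching sleep record are dropped here. A pipeline that should keep them would need an outer join instead.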
Owners of a fitness tracker have the right to export their own data from the device – not least thanks to the GDPR. With the tracker used here, the process is not that simple and there is little useful information on the internet.
In brief, the necessary steps for this fitness tracker on Android 9, 10 or 11 are:
- In the app, tap Profile in the bottom bar
- Scroll down and choose Settings
- In Settings, go to About and scroll down to Exercise user rights
- Under Exercise user rights, scroll down again and choose Transfer data
- Under Transfer data, select all the required data (see Figure 2) and have it sent to an email address
The fitness tracker first sends the data via the app to the exchange server, which sends an email with the download link and password within minutes. The link leads to a password-protected ZIP file. After the file is downloaded and extracted, the folder structure looks as shown in Figure 3. For the example in this article, only the ACTIVITY.csv and SLEEP.csv files are relevant. Figure 4 shows the structure of this data. Once Azure Synapse Analytics has been prepared, both files can be uploaded to the Azure cloud.
By the way, according to an article on Medium, owners of an Apple Watch can get their data as a ZIP file in just three steps. The ZIP archive contains a large XML file that only needs to be parsed before it can be processed in Azure Synapse.
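As a rough idea of what such parsing could look like, the following Python sketch streams through a large XML file and writes selected records to CSV. It assumes that the export contains `Record` elements with `type`, `startDate` and `value` attributes. This layout and the record type used in the demo are assumptions that should be checked against the actual file. The function name `records_to_csv` is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def records_to_csv(xml_source, out, record_type):
    """Stream an XML export and write matching records as CSV rows.

    xml_source: file name or binary file object of the XML export.
    record_type: value of the (assumed) `type` attribute to keep.
    Returns the number of rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["startDate", "value"])
    count = 0
    # iterparse reads the file incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "Record" and elem.get("type") == record_type:
            writer.writerow([elem.get("startDate"), elem.get("value")])
            count += 1
        elem.clear()  # drop the element's attributes and children once handled
    return count


# Made-up miniature export for demonstration purposes.
SAMPLE_XML = (
    b'<HealthData>'
    b'<Record type="HKQuantityTypeIdentifierStepCount" '
    b'startDate="2021-03-01 08:00:00 +0100" value="120"/>'
    b'<Record type="HKQuantityTypeIdentifierHeartRate" '
    b'startDate="2021-03-01 08:00:00 +0100" value="60"/>'
    b'</HealthData>'
)

if __name__ == "__main__":
    out = io.StringIO()
    records_to_csv(io.BytesIO(SAMPLE_XML), out, "HKQuantityTypeIdentifierStepCount")
    print(out.getvalue())
```

The resulting CSV could then be uploaded to Azure Blob Storage in the same way as the two files from the Xiaomi tracker.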