Exam DP-203: Data Engineering on Microsoft Azure (beta)
Update 5/12/21: I did successfully pass the DP-203 beta exam.
This is a brief summary of my experience taking Exam DP-203: Data Engineering on Microsoft Azure (beta). Note that it’s still in beta, so the information here may change as Microsoft updates the exam. For more information on taking beta exams, check out my post “Why Take Microsoft beta Exams”.
Overall, I thought it was an okay exam. At times some of the questions felt too easy: there were questions on basic T-SQL query construction and on Azure fundamentals such as storage types. On the other end of the spectrum, some of the case studies were a little convoluted and not as straightforward.
If you are looking to take the DP-203, I’d recommend brushing up on:
Azure Synapse
- What it is, intended audience
- How to set up and read external data via PolyBase
- Cost Management
- When and how to increase performance
Data Modeling
- Slowly changing dimensions
  - Type 0, 1, 2
- Denormalized vs Normalized
  - Difference between the two
  - When to use which
- Partitioning, partitioning, partitioning. A lot on this
- Columnstore indexes
- Hash vs Round robin distribution
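For the slowly changing dimension types above, here is a minimal sketch in plain Python of how Type 0, 1, and 2 differ when a customer changes city. The `CustomerRow` class, its columns, and the `apply_type*` helpers are all hypothetical names made up for illustration; in a real warehouse this logic would normally live in SQL or a Data Factory data flow.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    """One row of a hypothetical customer dimension table."""
    customer_id: int                  # business key
    city: str                         # the attribute being tracked
    valid_from: date                  # when this version became effective
    valid_to: Optional[date] = None   # None marks the current version


def apply_type0(rows: List[CustomerRow], customer_id: int, new_city: str) -> List[CustomerRow]:
    """Type 0: the original value is retained, so the change is ignored."""
    return rows


def apply_type1(rows: List[CustomerRow], customer_id: int, new_city: str) -> List[CustomerRow]:
    """Type 1: overwrite in place, so no history is kept."""
    for row in rows:
        if row.customer_id == customer_id:
            row.city = new_city
    return rows


def apply_type2(rows: List[CustomerRow], customer_id: int, new_city: str,
                change_date: date) -> List[CustomerRow]:
    """Type 2: close the current row and append a new version, keeping history."""
    for row in rows:
        if row.customer_id == customer_id and row.valid_to is None:
            if row.city == new_city:
                return rows           # nothing changed, keep the current version
            row.valid_to = change_date
    rows.append(CustomerRow(customer_id, new_city, change_date))
    return rows


history = [CustomerRow(1, "Seattle", date(2020, 1, 1))]
apply_type2(history, 1, "Denver", date(2021, 5, 12))
for row in history:
    print(row)
```

After the Type 2 update the table holds two rows for the same customer: the closed "Seattle" version and the current "Denver" version. A Type 1 update would leave a single row with the new city.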
Scripting Languages
- SQL
- R
Stream Analytics
- Windowing functions... lots on this
  - Tumbling
  - Sliding
  - Hopping
  - Session
  - Snapshot
- When and how to decrease latency
- What it can/can’t do
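To get a feel for the window types listed above, here is a rough sketch in plain Python (not the Stream Analytics query language) of how tumbling, hopping, and session windows group events. The function names are hypothetical, timestamps are plain integers in seconds, and windows are treated as half-open `[start, end)` intervals for simplicity; the exact boundary rules in Stream Analytics may differ, and the session sketch leaves out the maximum duration setting.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # (start, end), half-open: start <= ts < end


def tumbling_windows(ts: int, size: int) -> List[Window]:
    """Tumbling: fixed-size, non-overlapping windows, so each event is in exactly one."""
    start = (ts // size) * size
    return [(start, start + size)]


def hopping_windows(ts: int, size: int, hop: int) -> List[Window]:
    """Hopping: fixed-size windows that start every `hop` seconds and may overlap."""
    windows = []
    start = (ts // hop) * hop          # latest window start at or before ts
    while start > ts - size:           # window [start, start + size) still contains ts
        windows.append((start, start + size))
        start -= hop
    return sorted(windows)


def session_windows(timestamps: List[int], timeout: int) -> List[List[int]]:
    """Session: events are grouped while the gap between them stays within `timeout`."""
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


print(tumbling_windows(12, size=10))
print(hopping_windows(12, size=10, hop=5))
print(session_windows([1, 2, 3, 10, 11, 30], timeout=5))
```

An event at second 12 lands in one tumbling window but in two hopping windows, because a 10-second window that hops every 5 seconds overlaps its neighbor. A hopping window whose hop equals its size behaves like a tumbling window.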
Data Factory
- Triggers/Schedules
- Integration with Azure Synapse
- Optimize data for batch processing
- Best types of file formats for various activities
- Types of Data Movements
  - Mapping
  - External
  - Copy
- Integration Runtimes, when to use each
  - Azure
  - Self-hosted
  - SSIS
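As a study aid for the "when to use each" point on integration runtimes, here is a small rule-of-thumb sketch in Python. The function and parameter names are hypothetical, and real decisions involve more factors (networking options, cost, region), so treat it as a memory jogger rather than official guidance.

```python
def choose_integration_runtime(runs_ssis_packages: bool, source_is_private: bool) -> str:
    """Rule-of-thumb picker for a Data Factory integration runtime (illustrative only)."""
    if runs_ssis_packages:
        # Existing SSIS packages are lifted and run on the Azure-SSIS runtime.
        return "Azure-SSIS"
    if source_is_private:
        # On-premises or private-network data stores need a self-hosted runtime.
        return "Self-hosted"
    # Cloud data stores reachable over public endpoints can use the Azure runtime.
    return "Azure"


print(choose_integration_runtime(runs_ssis_packages=False, source_is_private=True))
```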
Storage Accounts
- Cost tiers
  - Hot
  - Cool
  - Archive
- Redundancy
  - Local
  - Zone
  - Geo read only
  - Geo
- How to read from Data Factory, Synapse, Databricks
Hi John,
Congrats on passing the DP-203 exam. Can I know how many case study questions there were, and is it mandatory to have R language skills? I have worked with Python but not R. Thanks.
Best Regards
Suresh
Well…I don’t know if I passed yet as beta exams can take a while for the grade to be finalized.
As for case studies, I would say it was the standard 2-4 case studies you see on Azure exams. There really wasn’t much on R, and whatever was there would most likely follow the format of other Microsoft exam questions: select the appropriate dropdown option to complete the command. I definitely wouldn’t let a lack of R experience hold you back from taking the exam.
Hope that helps!
I know SQL on SQL Server quite well. I am a beginner in Python. Are there a lot of, or any, Python-based questions?
Thank you in advance. I appreciate your help.
I don’t recall a ton of Python being on the exam. Full disclosure: since I took it while it was still in beta, it may have changed, and I encourage you to check out the latest exam outline. That being said, the SQL emphasis was partly T-SQL and more so best practices for indexing and data modeling. Also be sure to know the capabilities of the various cloud data tools, e.g. Data Factory and its integration runtime types, Azure Storage account features, and what you can do in Azure Synapse, which is a big topic too. If you go in knowing just SQL, you might struggle with the additional data engineering technologies.
Data engineering applications can be used to build applications that handle real-time financial transactions. These applications can be designed to manage complex queries that access a considerable amount of data, including trending charts and detailed historical data. They can also be built to deal with complex business practices, such as complex multi-user scenarios, real-time order execution, or real-time regulatory compliance. As a developer, you will need to understand these different scenarios and how they relate to your company’s data management needs.
Thanks… I do DP-203 training, and you have a better memory of the exam than I do 🙂