Required Skills/Knowledge To Pass AWS Machine Learning Specialty Certification
This article provides details on the skills or knowledge required to successfully pass the AWS ML specialty exam.
Disclaimer: All the views and opinions expressed in the blog belong solely to the author and not necessarily to the author's employer or any other group or individual. This article is not a promotion for any course or training platform. The sole objective of this article is to help the AWS community to successfully pass this difficult exam. Also, this article is based on my exam experience, which may differ from any other individual's exam experience.
I passed the AWS Certified Machine Learning — Specialty exam in November 2022. With this article, I would like to share my experience and the preparation I did to pass this certification exam. I won't repeat the details that you can get from the AWS certification page. Rather, I will share the topics you need to know to pass the exam and the type of questions you can expect during it.
The Courses and Practice Exams That I Took for This Exam
- I had the AWS virtual classroom training on Machine Learning, which is very similar to the self-paced online digital training provided by AWS Skill Builder (free content). Both courses are good starting points, but in my opinion, neither is sufficient to pass the exam.
- I took the AWS Certified Machine Learning Specialty 2022 — Hands On! course by Stephane Maarek and Frank Kane on Udemy. I went through mostly the Exploratory Data Analysis and Modeling topics, which are closely related to Data Science, since the AWS-specific topics were already well known to me.
- I took several practice exams. The explanations cleared a lot of doubts, especially around the Data Science domain. Please don't expect to see the same questions in the exam, but these practice exams will sharpen your understanding. Below is the list of practice exams that I took. You may not need to take all of them, but I highly recommend taking at least one before the exam to assess your knowledge.
- AWS Certified Machine Learning Specialty: 3 PRACTICE EXAMS by Abhishek Singh
- AWS Certified Machine Learning Specialty Practice Exams by Jon Bonso
- AWS Certified Machine Learning Specialty Full Practice Exam by Frank Kane
- I prepared personal notes from the courses and the practice exams and referred to them a lot. I highly recommend creating simple notes to capture the details that are less familiar to you, something you can refer to anytime, especially right before the exam.
Required Skills/Knowledge To Pass the Exam
Domain 1: Data Engineering
S3
- You need basic S3 knowledge, as S3 serves as the primary data source for AI/ML workloads in AWS.
- Understand the S3 storage classes, how lifecycle rules can move objects between storage classes to save cost, and S3 security (bucket policies, the different encryption mechanisms, and the S3 VPC endpoint). Expect questions on storage classes and lifecycle rules; a minimal sketch follows.
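For reference, here is a minimal boto3 sketch of a lifecycle rule that transitions objects to cheaper storage classes; the bucket name, prefix, and day counts are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical rule: move objects under raw/ to Standard-IA after 30 days,
# then to Glacier after 90 days, to cut the cost of storing cold training data.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-ml-data-bucket",  # assumption: replace with your bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw-training-data",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```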
Other Storage Services
- Understand that you can use EFS, EBS, and FSx for Lustre (fastest but costliest) as the storage solution for model training. Understand which option to use in which scenario. Expect questions like "the customer needs the fastest solution to complete the training job."
Kinesis Family
- Understand the differences between Kinesis Data Streams and Kinesis Data Firehose. Look for keywords like real-time vs. near real-time, data transformation, serverless, automatic scaling, and data retention. Know the different producers and consumers for both options.
- Have a high-level understanding of the Kinesis Video Streams service and its use cases, and how it integrates with AWS DeepLens, SageMaker, and Rekognition Video.
- Kinesis Data Analytics and its usage. Know that you can use the RANDOM_CUT_FOREST SQL function for anomaly detection on your streaming data; see the sketch below.
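To make that concrete, here is a hedged sketch of the anomaly-detection SQL for a Kinesis Data Analytics (SQL) application, held in a Python string for reference; the stream and column names are hypothetical.

```python
# Hedged sketch: this SQL would be supplied as the application code of a
# Kinesis Data Analytics (SQL) application (via the console or the
# kinesisanalytics API). Stream and column names are hypothetical.
ANOMALY_SQL = """
CREATE OR REPLACE STREAM "DEST_STREAM" (
    "sensor_value" DOUBLE,
    "ANOMALY_SCORE" DOUBLE
);

CREATE OR REPLACE PUMP "ANOMALY_PUMP" AS
INSERT INTO "DEST_STREAM"
SELECT STREAM "sensor_value", "ANOMALY_SCORE"
FROM TABLE(RANDOM_CUT_FOREST(
    CURSOR(SELECT STREAM "sensor_value" FROM "SOURCE_SQL_STREAM_001")
));
"""
```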
Glue, EMR & Athena
- Understand when to use Glue vs. EMR. Consider Glue if you see keywords like "less management" or "serverless" in the question, but consider EMR if you are dealing with an extremely large dataset, if you want to use big data frameworks other than Apache Spark, or if you need low-level control of your compute layer. Also, understand the usage of the Glue Data Catalog and crawlers. Know that Glue has a FindMatches ML transform that identifies duplicate or matching records in your dataset.
- Understand how Athena works and what problem it solves. Consider Athena if you see keywords like "ad hoc analysis," "serverless," "SQL query on unstructured data in S3," etc. A minimal query sketch follows this list.
- Have a high-level understanding of the Step Functions, AWS Batch, Data Pipeline, and DataSync services. You might see these service names pop up in a question or answer.
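Here is a hedged boto3 sketch of an ad hoc Athena query over data cataloged from S3; the database, table, and bucket names are hypothetical.

```python
import boto3

athena = boto3.client("athena")

# Run an ad hoc SQL query against data in S3 via Athena (serverless).
response = athena.start_query_execution(
    QueryString="SELECT label, COUNT(*) FROM training_events GROUP BY label",
    QueryExecutionContext={"Database": "ml_catalog"},       # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])  # poll get_query_execution for completion status
```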
Domain 2: Exploratory Data Analysis
Sanitize and prepare data for modeling
- Expect questions on imputing missing data. Have a good understanding of the different data imputation techniques (e.g., dropping vs. mean replacement vs. ML-based imputation vs. SMOTE). Understand how removing stop words, tokenizing, lowercasing, HTML tag removal, stemming, and lemmatization help preprocess text data before training.
- Have a very good understanding of oversampling vs. undersampling. Expect a lot of questions where you need to identify that the dataset is unbalanced and choose the techniques to apply; see the sketch after this list.
- Understand why scaling, normalization, shuffling, and standardization are needed. Consider normalization when you want to scale the features to comparable ranges; otherwise, features with larger magnitudes will carry more weight than they should (age vs. salary). Standardization, by contrast, is a scaling technique that reduces the effect of outliers in the features.
- Understand why labeling the dataset is important and how SageMaker Ground Truth and Mechanical Turk can help achieve that.
- Understand the different data distribution functions (normal distribution vs. Poisson distribution vs. binomial distribution vs. Bernoulli distribution). Understand the difference between a probability mass function and a probability density function.
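To make these preprocessing ideas concrete, here is a hedged scikit-learn/imbalanced-learn sketch of mean imputation, normalization vs. standardization, and SMOTE oversampling; the data is hypothetical, and SMOTE requires the separate imbalanced-learn package.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from imblearn.over_sampling import SMOTE  # assumption: imbalanced-learn is installed

# Hypothetical (age, salary) features with missing values and an unbalanced label.
X = np.array([
    [25, 50_000.0], [32, np.nan], [47, 120_000.0], [29, 58_000.0],
    [41, 95_000.0], [36, 72_000.0], [52, np.nan], [58, 150_000.0],
])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])  # unbalanced: 6 vs. 2

# Mean replacement for missing values.
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Normalization (min-max) scales features to a comparable 0-1 range;
# standardization (z-score) centers and scales, dampening magnitude effects.
X_norm = MinMaxScaler().fit_transform(X_imputed)
X_std = StandardScaler().fit_transform(X_imputed)

# SMOTE synthesizes new minority-class samples to rebalance the dataset.
X_res, y_res = SMOTE(k_neighbors=1).fit_resample(X_std, y)
print(np.bincount(y_res))  # both classes now have 6 samples
```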
Perform feature engineering
- Expect questions on binning, outliers, one-hot encoding, and reducing the dimensionality of data. Be sure to know binning vs. quantile binning, how to detect and minimize the impact of outliers, when to use one-hot encoding, and why/when/how to reduce the dimensionality of the dataset. Know that logarithm transformation and robust standardization address outliers in data, among other techniques. A short pandas sketch follows.
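As an illustration, here is a hedged pandas sketch of binning, quantile binning, and one-hot encoding; the columns and values are hypothetical.

```python
import pandas as pd

# Hypothetical dataset for feature engineering.
df = pd.DataFrame({
    "age": [22, 35, 47, 51, 63, 29],
    "city": ["NYC", "SFO", "NYC", "LAX", "SFO", "LAX"],
})

# Fixed-width binning: equal-sized value ranges.
df["age_bin"] = pd.cut(df["age"], bins=3)

# Quantile binning: each bin holds roughly the same number of rows.
df["age_qbin"] = pd.qcut(df["age"], q=3)

# One-hot encoding for the categorical feature.
df = pd.get_dummies(df, columns=["city"])
print(df.head())
```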
Analyze and visualize data for machine learning
- Know the different graphing techniques (scatter plot, histogram, box plot, elbow plot) and when to use each. Know that a scatter plot shows how two features are related, whereas a histogram shows the distribution of an individual feature; a small matplotlib sketch follows this list.
- Understand that Amazon QuickSight provides data visualization; know its different visualization types and supported data sources. Know that out of the box, QuickSight provides anomaly detection, forecasting, and auto-narratives.
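Here is a hedged matplotlib sketch contrasting the two plot types; the data and the age/salary relationship are synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
age = rng.normal(40, 10, 200)
salary = age * 1500 + rng.normal(0, 5000, 200)  # hypothetical relationship

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: relationship between two features.
ax1.scatter(age, salary, s=10)
ax1.set(title="age vs. salary", xlabel="age", ylabel="salary")

# Histogram: distribution of a single feature.
ax2.hist(age, bins=20)
ax2.set(title="age distribution", xlabel="age", ylabel="count")

plt.tight_layout()
plt.show()
```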
Domain 3: Modeling
Frame business problems as machine learning problems
- You might see a question where you need to determine if ML is a better solution than traditional programming to solve the business problem. ML can be a better choice when you are dealing with problems of scale or rules that cannot be hand-coded (e.g., email spam or fraudulent credit card transaction scenarios).
- Have a very good understanding of supervised vs. unsupervised vs. reinforcement learning algorithms. Also, given a problem, you should be able to identify whether it is a classification (binary vs. multiclass) or regression type of problem. There will also be questions related to forecasting, clustering, and recommendation engines.
Select the appropriate model(s) for a given machine learning problem
- You need to know all of the out-of-the-box SageMaker algorithms and which one to use in which scenario. A few very important ones are Linear Learner, XGBoost, KNN, K-means, DeepAR, Seq2Seq, Object2Vec, Word2vec, BlazingText, Object Detection, Image Classification, Semantic Segmentation, Factorization Machines, Random Forest, LDA, and PCA. Multiple questions will ask you to choose the ML algorithm based on the business problem. Understand the difference between logistic regression and linear regression. A minimal training sketch follows this list.
- Also, know that beyond the built-in SageMaker algorithms, you can create your own algorithm using Apache Spark, write custom code with TensorFlow or Apache MXNet, package your own algorithm and code in a Docker image, or use an algorithm from AWS Marketplace. Understand when to use which option; think in terms of time and effort, cost, and management of your ML solution.
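For orientation, here is a hedged SageMaker Python SDK sketch of training the built-in XGBoost algorithm; the role ARN, bucket paths, and hyperparameter choices are hypothetical.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
region = session.boto_region_name
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # assumption: your execution role

# Built-in XGBoost container image for this region.
xgb_image = image_uris.retrieve("xgboost", region, version="1.5-1")

estimator = sagemaker.estimator.Estimator(
    image_uri=xgb_image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",              # CPU is fine for XGBoost training
    output_path="s3://my-ml-bucket/models/",   # hypothetical bucket
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

# Channels point at CSV training/validation data in S3 (hypothetical paths).
estimator.fit({
    "train": TrainingInput("s3://my-ml-bucket/train/", content_type="text/csv"),
    "validation": TrainingInput("s3://my-ml-bucket/val/", content_type="text/csv"),
})
```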
Train machine learning models
- Understand how you can split the dataset into training, validation, and test sets (typically 80-10-10). Also, understand different cross-validation techniques, such as k-fold cross-validation.
- Understand what optimizers, gradient descent, loss functions, local minima, convergence, batches, and probability are. Expect questions on gradient descent, loss functions, local minima, and model convergence in the exam.
- Know the different compute choices available on SageMaker. Typically, GPU is preferred for training deep learning algorithms, but non-GPU instances are good for inference because they are cost-effective and inference is usually less demanding. Also, know that Elastic Inference can speed up throughput and decrease the latency of getting real-time inferences from your deep learning models at a fraction of the cost of using a GPU instance for inference.
- Understand how the batch size and learning rate impact model training. Large batch sizes lead to faster training but run the risk of getting stuck in local minima. Similarly, if the learning rate is too high, gradient descent can overshoot the correct solution, and the loss function will oscillate. The toy sketch below illustrates the learning-rate effect.
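A minimal NumPy-free sketch, assuming the toy loss f(w) = (w - 3)^2 with gradient 2(w - 3), showing how the learning rate changes convergence behavior:

```python
# Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
def descend(lr: float, steps: int = 10, w: float = 0.0) -> float:
    for _ in range(steps):
        w -= lr * 2 * (w - 3)  # w := w - lr * f'(w)
    return w

print(descend(lr=0.1))   # converges smoothly toward 3
print(descend(lr=0.95))  # overshoots back and forth while slowly converging
print(descend(lr=1.1))   # diverges: the error oscillates and grows
```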
Perform hyperparameter optimization
- Regularization techniques such as dropout, early stopping, and L1/L2 are very important; they are used to prevent overfitting. Know the difference between L1 and L2.
- Have an understanding of RNNs vs. CNNs. Know that CNNs are ideal for image and video analysis, object detection, computer vision applications, or multi-dimensional data in general, whereas RNNs are used for applications where the predicted value depends on previously seen values: time-series forecasting, speech recognition, and modeling sequence data. Know what long short-term memory (LSTM) is.
- Know the different kinds of activation functions and their use cases. You need to know the differences between the sigmoid, softmax, tanh, and ReLU activation functions. Expect questions on activation functions; a NumPy sketch of the common ones follows.
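A hedged reference sketch of the common activation functions in NumPy:

```python
import numpy as np

def sigmoid(x):   # squashes to (0, 1); typical for binary-class output
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):      # squashes to (-1, 1); zero-centered
    return np.tanh(x)

def relu(x):      # cheap; avoids vanishing gradients for x > 0; hidden layers
    return np.maximum(0.0, x)

def softmax(x):   # turns scores into a probability distribution; multiclass output
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), tanh(z), relu(z), softmax(z), sep="\n")
```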
Evaluate machine learning models
- You need to know how you can detect and handle bias and prevent the model from overfitting or underfitting. There will be many questions on this.
- Typically, your model is underfitting when it performs poorly on the training data. Adding more features to the model or removing regularization can help. You can also increase the model complexity, train the model longer (more epochs), or use a different network architecture. But adding more training data may or may not help.
- Your model is overfitting when you see that the model performs well on the training data but does not perform well on the evaluation data. This is because the model is memorizing the data it has seen and is unable to generalize to unseen examples. Regularization techniques can prevent the overfitting of your model.
- You need to know the binary and multi-class confusion matrices. Expect questions on evaluation metrics (AUC-ROC, accuracy, precision, recall, RMSE, F1 score). Be sure to understand the use cases and memorize the calculations for accuracy, precision, recall, and F1 score; see the sketch after this list.
- Understand how you can evaluate your model and perform testing using SageMaker (e.g., variant weights for A/B testing, blue/green deployments). Understand how you can use different metrics for evaluation.
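A quick sketch of those calculations from a hypothetical binary confusion matrix:

```python
# Hypothetical confusion-matrix counts.
tp, fp, fn, tn = 80, 10, 20, 890

accuracy  = (tp + tn) / (tp + tn + fp + fn)      # correct predictions / all predictions
precision = tp / (tp + fp)                       # of predicted positives, how many are right
recall    = tp / (tp + fn)                       # of actual positives, how many were found
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```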
Domain 4: Machine Learning Implementation and Operations
Build machine learning solutions for performance, availability, scalability, resiliency, and fault tolerance
- Understand how you can use CloudWatch logs, metrics, and events to monitor your ML workload. You can set up a scaling policy that defines target metrics, min/max capacity, and cooldown periods driven by CloudWatch; see the sketch after this list. AWS CloudTrail provides a record of actions taken by a user, role, or AWS service in Amazon SageMaker, but it does not monitor calls to InvokeEndpoint.
- Understand how you can use SageMaker for multi-AZ or multi-region deployment. Know that SageMaker automatically attempts to distribute instances across Availability Zones (if multiple instances are specified). Understand how you can use a custom AMI and how SageMaker VMs can run in an Auto Scaling group behind a load balancer.
- Understand the basics of SageMaker rightsizing for instances (Spot vs. On-demand) and how you can select optimal volume and IOPS according to your workload.
- Understand how pipe input mode can help your training jobs to start sooner, finish quicker, and use less disk space when using large files from S3 to train your model.
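Here is a hedged boto3 sketch of target-tracking auto scaling for a SageMaker endpoint variant; the endpoint name, variant name, capacities, and target value are hypothetical.

```python
import boto3

aas = boto3.client("application-autoscaling")

# Hypothetical endpoint/variant to scale.
resource_id = "endpoint/my-endpoint/variant/AllTraffic"

aas.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

aas.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,  # target invocations per instance (CloudWatch-driven)
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleOutCooldown": 60,   # seconds before another scale-out
        "ScaleInCooldown": 300,   # seconds before another scale-in
    },
)
```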
Recommend and implement the appropriate machine learning services and features for a given problem
- Have a high-level understanding of the following AWS AI services. You should have some understanding of all the AI services mentioned in the exam guide, but based on my exam experience, I would highly recommend the below ones.
- Amazon Comprehend: Performs NLP and extracts key phrases, entities, and sentiment. You can train with your own data.
- Amazon Translate: Language translation service that supports custom terminology.
- Amazon Transcribe: Speech-to-text service that provides speaker identification, automatic language identification, and custom vocabulary.
- Amazon Polly: Text-to-speech service that provides lexicons (custom pronunciation of words and phrases) and supports SSML (Speech Synthesis Markup Language).
- Amazon Lex: Natural-language chatbot engine that is built around intent.
- Amazon Rekognition: Can be used for image & video analysis. Can identify faces and text in the image. You can use your own labeled data to identify unique items.
- Amazon Personalize: Recommendation service. Know the difference between GetRecommendations and GetPersonalizedRanking APIs.
- AWS DeepLens: Deep learning-enabled video camera, integrated with Kinesis Video Streams, Rekognition, SageMaker, Polly, TensorFlow, MXNet, and Caffe. Understand that it is not practical to use AWS DeepLens as a surveillance camera.
- Look for keywords such as "less management," "business needs a quick solution," "serverless," etc. Based on the question, judge whether you should train and deploy a model in SageMaker or choose one of the easy-to-use AI services. The same logic applies to choosing between your own model and SageMaker's built-in algorithms.
- Understand how you can minimize infrastructure cost using Spot Instances and lower-cost instance types. Understand that GPU instances are really costly. You can use Spot Instances for model training with checkpoints written to S3, but doing so can increase the training time. Also, have a high-level understanding of how the AWS Batch service can schedule compute instances to perform a batch job.
Apply basic AWS security practices to machine learning solutions
- Have a good understanding of SageMaker security and how it integrates with IAM, CloudWatch, CloudTrail, and VPC. Understand how the data can be encrypted at rest and in transit.
- Understand how you can use S3 bucket policies to restrict access. Understand the different encryption options with S3 (SSE-S3 vs. SSE-KMS vs. SSE-C vs. CSE). Understand how you can use the S3 and other VPC endpoints to access files in S3 and other services when you don't have egress internet access from your SageMaker instances. A minimal encryption sketch follows.
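For example, a hedged boto3 sketch of uploading a training artifact encrypted at rest with SSE-KMS; the bucket, key, and KMS key alias are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Upload an object encrypted server-side with a customer-managed KMS key.
s3.put_object(
    Bucket="my-ml-data-bucket",          # hypothetical bucket
    Key="datasets/train.csv",
    Body=b"col1,col2\n1,2\n",            # placeholder payload
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-ml-kms-key",   # hypothetical key alias
)
```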
Deploy and operationalize machine learning solutions
- You need to know how you can deploy models and interact with them. Once you train a model and it's ready for production, you create an endpoint configuration and then create the endpoint to deploy it. The service automatically launches the requested number of ML compute instances and places them in different AZs (if you specify two or more instances). A minimal deployment sketch follows this list.
- Know that SageMaker also supports auto scaling for production variants. Auto scaling works with CloudWatch, which can dynamically adjust the number of compute instances based on the load.
- Understand what an inference pipeline is. Know that it's a linear sequence of 2-15 containers where you can combine pre-processing, predictions, and post-processing. It can be used for both real-time and batch predictions. Within an inference pipeline model, Amazon SageMaker handles invocations as a sequence of HTTP requests.
- Understand how you can monitor the performance of your model and the different options that you can use to debug your model issues in production.
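Here is a hedged boto3 sketch of the create-config/create-endpoint/invoke flow; the model, config, and endpoint names are hypothetical, and the model is assumed to exist from a prior training job.

```python
import boto3

sm = boto3.client("sagemaker")

# Endpoint configuration: which model, instance type, count, and traffic weight.
sm.create_endpoint_config(
    EndpointConfigName="churn-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "churn-model",      # assumption: created earlier via create_model
        "InstanceType": "ml.m5.large",   # CPU is often enough for inference
        "InitialInstanceCount": 2,       # spread across AZs automatically
        "InitialVariantWeight": 1.0,     # adjust weights across variants for A/B tests
    }],
)
sm.create_endpoint(EndpointName="churn-endpoint", EndpointConfigName="churn-config")

# Invoke the endpoint once it is InService.
runtime = boto3.client("sagemaker-runtime")
resp = runtime.invoke_endpoint(
    EndpointName="churn-endpoint",
    ContentType="text/csv",
    Body=b"42,0,1,99.5",                 # placeholder feature row
)
print(resp["Body"].read())
```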