[vc_row][vc_column width=”1/1″][vc_column_text]A decision that many engineers face at some point in their careers is what to focus their attention on next. One of the great advantages of working in a consultancy is exposure to many different technologies, which gives you the opportunity to explore any emerging trends you are interested in. I’ve been lucky enough to work with a huge variety of clients, ranging from industry leaders in the FTSE 100 to smaller start-ups disrupting the same technology space.
So why did I pick Big Data?
A common pattern I’ve noticed is that everyone has access to data – large amounts of raw, unstructured data. Business and technology leaders all recognise the importance of it, and the value and insight that it can deliver. Processes have been established to extract, transform and store this large amount of information, but the architecture is usually inefficient and incomplete.
Years ago these steps may have amounted to an efficient data pipeline, but with emerging technologies such as Kinesis Streams, Redshift and even serverless databases there is now another way. We can build a solution that is real-time, cost-efficient and has a low operational overhead.
Alongside this, companies are setting their sights on creating a data lake in the cloud. In doing so, they take advantage of a whole suite of technologies to store information both in the formats they use today and in configurations they may want to harness in the future. These are all clear steps in the journey towards digital transformation, and with the current pace of development in AWS technologies it is the perfect time to become better acquainted with Big Data.
But why is the certification necessary?
The AWS Certified Big Data Specialty exam introduces and validates several key big data fundamentals. The exam is not limited to AWS-specific technologies; it also explores tools from the wider big data community. Taken straight from the exam guide, the domains cover collection, storage, processing, analysis, visualisation and data security.
These domains involve a broad range of technical roles, from data engineers and data scientists to individuals in SecOps. Personally, I’ve had some exposure to the collection and storage of data but much less to visualisation and security. You certainly have to be comfortable wearing many different hats when tackling this exam, as it tests not only your technical understanding of the solutions but also the business value created by implementing them. It’s equally important to consider the costs involved, including any forecasts as the solution scales.
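To make the point about forecasting costs concrete, here is a minimal sketch of the kind of back-of-the-envelope projection I have in mind. Everything in it is an assumption for illustration: the `CostAssumptions` class, the `forecast_monthly_costs` function, the prices and the growth rate are all hypothetical and are not real AWS pricing. The model simply assumes that ingested data grows by a fixed percentage each month and that nothing is ever deleted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CostAssumptions:
    """Hypothetical unit prices and growth rate, NOT real AWS pricing."""
    storage_price_per_gb_month: float = 0.02  # assumed storage price
    ingest_price_per_gb: float = 0.03         # assumed ingestion price
    monthly_growth_rate: float = 0.10         # ingest grows 10% per month


def forecast_monthly_costs(initial_ingest_gb: float,
                           months: int,
                           assumptions: CostAssumptions) -> List[float]:
    """Estimate the bill for each month as stored data accumulates."""
    costs = []
    stored_gb = 0.0
    ingest_gb = initial_ingest_gb
    for _ in range(months):
        # Simplification: data is never deleted or moved to a cheaper tier.
        stored_gb += ingest_gb
        cost = (stored_gb * assumptions.storage_price_per_gb_month
                + ingest_gb * assumptions.ingest_price_per_gb)
        costs.append(round(cost, 2))
        # Next month's ingest volume grows by the assumed rate.
        ingest_gb *= 1 + assumptions.monthly_growth_rate
    return costs


if __name__ == "__main__":
    # Project one year of costs starting from 1,000 GB ingested per month.
    for month, cost in enumerate(
            forecast_monthly_costs(1000, 12, CostAssumptions()), start=1):
        print(f"Month {month}: {cost}")
```

Even a toy model like this shows why the exam keeps asking about scale: storage charges compound because old data stays around, so the cheapest option at month one is not necessarily the cheapest at month twelve.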
Having already completed several associate exams, I found this certification considerably harder because you are required to deep dive into Big Data concepts and the relevant technologies. One of the benefits of this certification is that its scope extends to how these technologies are applied to Big Data problems, so be prepared to dive into Machine Learning and popular frameworks like Spark & Presto.
Okay, so how do I pass the exam?
1. A Cloud Guru’s AWS Certified Big Data Specialty course provides an excellent introduction and overview.
2. Have some practical experience of Big Data in AWS; theoretical knowledge is not enough to pass this exam…
3. Understand the different storage options on AWS – S3, DynamoDB, RDS, Redshift, HDFS vs EMRFS, HBase…
4. Understand the differences and use cases of popular Big Data frameworks e.g. Presto, Hive, Spark.
5. Data Security contributes the most to your overall exam score at 20% and is involved in every single AWS service. There are always options for making the solution more secure and sometimes they’re enabled by default.
6. Performance is a key trend.
7. Dive into Machine Learning (ML).
8. Dive into Visualisation.
It can’t be emphasised enough that AWS themselves provide excellent resources for learning. As preparation for the exam, definitely watch re:Invent videos and read AWS blogs and case studies.
Watch these videos:
Read these AWS blogs:
All of the Big Data services developer guides.
One last note…
This exam will expect you to consider each question from many different perspectives. You’ll need to think about not just the technical feasibility of the solution presented but also the business value that can be created. The majority of questions are scenario-specific, and often there is more than one valid answer. Look for subtle clues to determine which solution is more ‘correct’ than the others, e.g. whether speed is a factor or whether the question expects you to answer from a cost perspective.
Finally, this exam is very long (3 hours) and requires a lot of reading. I found that the time given was more than enough, but remember to pace yourself, otherwise you can burn out quite easily.
Hopefully my experience and tips will help you prepare for the exam. Let us know if they did.
[/vc_column_text][/vc_column][/vc_row]