Our data consultants fully immersed themselves in the future world of Amazon Web Services (AWS) during the AWS Summit 2023. During this inspiring day, AWS presented promising developments that we can expect shortly. In this blog, we share the ones we believe have the most potential.

What else did we expect? Generative AI is all the rage at the moment, and let's be honest, its potential is tremendous. This is no different within AWS. Two AWS services make clever use of this:

1. Generative AI service Amazon Bedrock

AWS will offer Amazon Bedrock, a Generative AI tool, as Software as a Service (SaaS). It allows you to build a customized solution similar to ChatGPT for your business. You can use it for chatbots, image generation, and personalization. The unique aspect of Bedrock is that you can customize its models with your own data, and that data remains within your AWS environment. This enables you to create precise solutions. At the time of writing, Bedrock is only available in a limited preview, so stay tuned!

2. Automatic code completion with Amazon CodeWhisperer

Developers, prick up your ears! Amazon makes your life easier with CodeWhisperer. This service is designed to assist developers by automatically completing their code while working on it. It is trained on public code projects but considers the specific variables and parameters defined in your project. CodeWhisperer can help developers work faster and optimize their code. 

Zero Extract Transform Load (ETL) eliminates the need for the transformation step in the initial data processing. This approach directly extracts data from the source system and transfers it to an analytical database, allowing immediate querying. For some time now, Redshift Spectrum has made it possible to query data in S3 directly from Redshift, without loading it first. At the summit, a new integration between Aurora and Redshift was presented, further advancing the concept of Zero ETL.

Zero ETL enables near real-time data availability in your analytical database by eliminating the computationally intensive transformation step. You can derive immediate insights from the data or apply machine learning algorithms. You can still perform and schedule the transformation step if necessary, but you no longer need to wait for the entire dataset to be processed.
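To make the idea concrete, here is a minimal sketch of the pattern rather than of the Aurora and Redshift integration itself. It assumes Python's built-in `sqlite3` as a stand-in for the analytical database, and the table, view, and sample rows are hypothetical. Raw rows are loaded unchanged and are queryable right away; the transformation is an optional view that is evaluated at query time.

```python
import sqlite3

# Hypothetical raw records, shaped the way a source system might emit them.
RAW_ORDERS = [
    (1, "2023-06-01", "nl", 1999),
    (2, "2023-06-01", "be", 4950),
    (3, "2023-06-02", "nl", 1250),
]


def load_raw(conn, rows):
    """Copy source rows into the analytical store unchanged (no transform step)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id INTEGER, order_date TEXT, country TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def define_transformation(conn):
    """Optional transformation, expressed as a view evaluated at query time."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS revenue_per_country AS "
        "SELECT UPPER(country) AS country, "
        "SUM(amount_cents) / 100.0 AS revenue_eur "
        "FROM raw_orders GROUP BY UPPER(country)"
    )


def revenue_per_country(conn):
    """Return a {country: revenue in euros} mapping from the view."""
    return dict(conn.execute("SELECT country, revenue_eur FROM revenue_per_country"))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # sqlite3 stands in for the warehouse
    load_raw(conn, RAW_ORDERS)
    # The raw data can be queried immediately, before any transformation exists.
    print(conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0])
    define_transformation(conn)
    print(revenue_per_country(conn))
```

In a real setup the load step would be handled by the managed integration, and the view would live in the warehouse. The sketch only shows why querying does not have to wait for a transformation job.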

AWS provides tools such as AWS Step Functions and Amazon EventBridge that allow you to trigger data processing based on events. Since the process starts when an event occurs, you can execute it in near real-time. An event-driven architecture in AWS consists of four components:

  1. An event occurs: the most common scenario is a file landing in S3.

  2. This triggers an event in EventBridge.

  3. In Step Functions, a workflow is executed. This workflow consists of actions and decisions that transform, move, or enrich the data.

  4. The Simple Queue Service (SQS) connects various processes within Step Functions. 
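The four components above can be sketched locally in plain Python. This is an illustration under stated assumptions, not AWS code: the event pattern follows the EventBridge pattern format for S3 "Object Created" events, but `matches` is a simplified matcher that only handles exact values and nested objects, the workflow steps are hypothetical, and a `deque` stands in for SQS. The bucket name is made up.

```python
from collections import deque

# Event pattern in the EventBridge format: every listed field must match.
S3_OBJECT_CREATED_PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"bucket": {"name": ["example-landing-bucket"]}},
}


def matches(pattern, event):
    """Simplified matcher: supports exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def validate(state):
    """Hypothetical workflow step: only CSV files are considered valid."""
    return {**state, "valid": state["key"].endswith(".csv")}


def enrich(state):
    """Hypothetical workflow step: derive a target table from the key prefix."""
    return {**state, "target_table": state["key"].split("/")[0]}


def run_workflow(event):
    """Stand-in for a Step Functions workflow: actions plus one decision."""
    detail = event["detail"]
    state = {"bucket": detail["bucket"]["name"], "key": detail["object"]["key"]}
    state = validate(state)
    if not state["valid"]:  # decision: stop early for invalid files
        return state
    return enrich(state)


def handle_event(event, queue):
    """Route a matching event through the workflow and queue the result."""
    if not matches(S3_OBJECT_CREATED_PATTERN, event):
        return False
    queue.append(run_workflow(event))  # the deque stands in for SQS
    return True


def make_event(key, bucket="example-landing-bucket"):
    """Build a minimal event shaped like an S3 notification via EventBridge."""
    return {
        "source": "aws.s3",
        "detail-type": "Object Created",
        "detail": {"bucket": {"name": bucket}, "object": {"key": key}},
    }


if __name__ == "__main__":
    queue = deque()
    handle_event(make_event("orders/2023-06-01.csv"), queue)
    print(queue[0])
```

In AWS itself, the pattern would be attached to an EventBridge rule with the state machine as its target, so no matching code of your own is needed.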

A significant development is that Step Functions now has broad compatibility with almost all AWS services, expanding the possibilities for event-driven architecture.

Current data lakes often struggle with updating information, which can be costly. Consequently, new versions of the data are frequently written alongside the existing files, leaving the old files intact. This poses compliance issues with regulations like GDPR, where deleting or anonymizing data (under the right to be forgotten) can be expensive. Open table formats such as Apache Iceberg and Apache Hudi are gaining popularity as they provide a solution to this problem. These formats allow for easy and cost-effective modification of files within the data lake. Technically, they make this possible by adding a metadata layer on top of Apache Parquet files.

Figure: Apache Iceberg metadata, technical explanation
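The role of the metadata layer can be illustrated with a toy model. This is a deliberately simplified sketch and not the actual Iceberg or Hudi format: the `Table` and `DataFile` classes are hypothetical, data files are immutable tuples standing in for Parquet files, and a snapshot is just the list of files that are currently live. A delete rewrites only the affected file and records a new snapshot; untouched files are reused as they are.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """Immutable data file, standing in for a Parquet file."""
    path: str
    rows: tuple


@dataclass
class Table:
    """Toy metadata layer: snapshots list which data files are live."""
    files: dict = field(default_factory=dict)      # path -> DataFile
    snapshots: list = field(default_factory=list)  # each: tuple of live paths

    def current(self):
        return self.snapshots[-1] if self.snapshots else ()

    def append(self, path, rows):
        self.files[path] = DataFile(path, tuple(rows))
        self.snapshots.append(self.current() + (path,))

    def delete_where(self, predicate):
        """Copy-on-write delete: rewrite only files that contain matches."""
        live = []
        for path in self.current():
            rows = self.files[path].rows
            kept = tuple(r for r in rows if not predicate(r))
            if len(kept) == len(rows):
                live.append(path)  # untouched file is reused as-is
            elif kept:
                new_path = f"{path}.v{len(self.snapshots)}"
                self.files[new_path] = DataFile(new_path, kept)
                live.append(new_path)
        self.snapshots.append(tuple(live))

    def scan(self, snapshot=-1):
        """Read all rows visible in the given snapshot (default: latest)."""
        return [r for p in self.snapshots[snapshot] for r in self.files[p].rows]

    def expire_old_snapshots(self):
        """Drop history and remove files that no snapshot references anymore."""
        self.snapshots = [self.current()]
        live = set(self.current())
        self.files = {p: f for p, f in self.files.items() if p in live}


if __name__ == "__main__":
    table = Table()
    table.append("a.parquet", [{"user": "alice"}, {"user": "bob"}])
    table.append("b.parquet", [{"user": "carol"}])
    table.delete_where(lambda row: row["user"] == "alice")
    print(table.scan())
    table.expire_old_snapshots()  # only now is the old data physically gone
    print(sorted(table.files))
```

The sketch also shows why a right-to-be-forgotten request has two parts in such a design: the delete makes the data invisible to readers, and expiring old snapshots removes the files that still hold it.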

Amazon has made significant commitments to sustainability and is on schedule to become carbon neutral by 2040, ten years ahead of the Paris Agreement's target year. At this moment, 75% of their data centers are climate neutral. In addition, they plan to return the clean water used in their cooling systems to the community in which each data center is located.

Are you as excited as we are about these new cloud developments? Or did this leave you with questions? We are happy to think along with you. You can always call or send us an email.