dataengineering

OLAP (Online Analytical Processing)

OLAP (Online Analytical Processing) is a technology that enables analysts to extract and query data interactively from multidimensional data warehouses. It provides a way to analyze complex datasets for decision-making, typically in business intelligence (BI) applications.

Definition of OLAP

OLAP is a system for organizing large business databases and supporting complex analysis. Unlike OLTP (Online Transaction Processing), which focuses on fast, real-time transactional operations, OLAP emphasizes analytical operations such as summarizing, aggregating, and comparing data across multiple dimensions.

Core Concept of OLAP

At its core, OLAP uses a multidimensional data model, often referred to as a "cube." This cube allows…
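As a rough illustration of the cube idea, the sketch below treats a handful of made-up sales facts as a three-dimensional cube (region, product, quarter) and runs two typical analytical operations on it: a roll-up that aggregates away dimensions, and a slice that fixes one dimension to a single value. The data, the dimension names, and the helpers `roll_up` and `slice_cube` are all hypothetical and chosen only for this example; a real OLAP engine does this at far larger scale, usually with precomputed aggregates.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales).
# The first three fields are dimensions, the last one is the measure.
FACTS = [
    ("EU", "laptop", "Q1", 120),
    ("EU", "laptop", "Q2", 150),
    ("EU", "phone", "Q1", 80),
    ("US", "laptop", "Q1", 200),
    ("US", "phone", "Q1", 90),
    ("US", "phone", "Q2", 110),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Sum the sales measure, grouping by the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Keep only the facts where one dimension equals a fixed value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in facts if row[position] == value]


# Roll-up: total sales per region, aggregating over product and quarter.
by_region = roll_up(FACTS, ["region"])

# Slice then roll-up: Q1 sales per product, across all regions.
q1_by_product = roll_up(slice_cube(FACTS, "quarter", "Q1"), ["product"])

print(by_region)      # {('EU',): 350, ('US',): 400}
print(q1_by_product)  # {('laptop',): 320, ('phone',): 170}
```

The same two helpers can be combined to compare any pair of dimensions, which is the kind of summarizing and comparing that separates analytical workloads from transactional ones.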
End-to-End AWS KMS Encryption and Decryption Tutorial

We're excited to share our new tutorial on Keyper. Keyper v0.0.3 now supports AWS (in addition to GCP) for end-to-end data and file encryption and decryption. Whether you're a data engineer, platform engineer, or security analyst, this guide will help you securely manage encryption keys and protect sensitive data in your AWS cloud environment using AWS IAM and KMS in three simple commands.

➡️ Go to the Keyper AWS tutorial now

Why Use Keyper and AWS KMS for Data Security?

Data security is increasingly important, and encryption is one of the most effective ways to defend against unauthorized access. Keyper…
Uploading Files Using Pre-Signed URLs to a Specific Storage Class

Here’s a step-by-step guide to uploading files to AWS S3 with pre-signed URLs that target a specific storage class. I’ll cover how to generate a pre-signed URL in Python and how to use it in Postman.

Create an IAM User:

1. Sign in to the AWS Management Console.
2. Navigate to IAM (Identity and Access Management): Open the IAM Console.
3. Create a New User: Click on Users in the sidebar. Click the Add user button. Enter a user name (e.g., s3-uploader). Select Programmatic access for the access type to generate an access key ID and secret access key.…
BigQuery Schema Generation Made Easier with PyPI’s bigquery-schema-generator

When importing data into BigQuery, a crucial step is defining the table's structure: its schema. This schema can be auto-detected or defined manually.

Auto-Detection with BigQuery’s LoadJobConfig Method (for Smaller Datasets)

When we load data from a CSV file, we pass a LoadJobConfig object with the autodetect parameter set to True. This tells BigQuery's data importer (bq load) to peek at the first 500 records of your data to guess its schema. This works well for smaller datasets, especially if the data originates from a well-defined source like a pre-existing database.

Manual Definition: Tedious for Large & Evolving Data…
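To make the limitation of sample-based detection concrete, here is a toy sketch in plain Python. It is not BigQuery's actual inference algorithm: the helpers `guess_type` and `infer_schema`, the three type names, and the sample data are assumptions made for illustration. It only shows how a schema guessed from the first few records can differ from one guessed from the whole file.

```python
import csv
import io

# Order in which a column's guessed type may widen (illustrative only).
WIDENING = {"INTEGER": 0, "FLOAT": 1, "STRING": 2}


def guess_type(value):
    """Return the narrowest type name that can hold one CSV cell."""
    for caster, name in ((int, "INTEGER"), (float, "FLOAT")):
        try:
            caster(value)
            return name
        except ValueError:
            pass
    return "STRING"


def infer_schema(csv_text, sample_size):
    """Guess a type per column from at most `sample_size` data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    schema = {}
    for row_number, row in enumerate(reader):
        if row_number >= sample_size:
            break
        for column, value in row.items():
            seen = guess_type(value)
            current = schema.get(column, "INTEGER")
            # Keep whichever of the two types is wider.
            schema[column] = max(current, seen, key=WIDENING.get)
    return schema


# The third record is the first one with a fractional amount.
DATA = "id,amount\n1,10\n2,20\n3,7.5\n"

sampled = infer_schema(DATA, sample_size=2)
full = infer_schema(DATA, sample_size=100)

print(sampled)  # {'id': 'INTEGER', 'amount': 'INTEGER'}
print(full)     # {'id': 'INTEGER', 'amount': 'FLOAT'}
```

With a sample of two records the `amount` column looks like an integer, so a load that later meets `7.5` would not fit the guessed schema. Scanning every record avoids the mismatch at the cost of a full pass over the data.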
FastAPI for Data Applications: From Concept to Creation. Part I

In this blog post, we'll explore how to create an API using FastAPI, a modern Python framework designed for building APIs with high performance. We will create a simple API that allows users to add, update, and query items stored temporarily in memory. Alongside this, we'll discuss how you can extend this example to expose machine learning models, perform online processing in decision engines, and ensure best practices for a robust, secure API.

Pre-requisites: Installation of FastAPI and Uvicorn

Before diving into the code, we need to install FastAPI and Uvicorn, an ASGI server to run our application. Run the…