Posts

Showing posts from 2023

My Research Paper Published in IEEE

Hi All, Just wanted to share my research paper, which was published back in Oct 2021. It is about a web-based app that calculates the nitrogen concentration of a rice leaf using a Leaf Color Chart (LCC). The approach was to calculate the average RGB and get the HEX color code for all shades in the LCC. The leaf color chart is a tool used to estimate the nitrogen concentration. The web/Android app that I made digitizes the LCC. You can read more on LCC here: https://iiss.icar.gov.in/eMagazine/v3i1/9.pdf Here is the paper on the IEEE Xplore website: https://ieeexplore.ieee.org/document/9555875/ Thanks
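To make the "average RGB, then HEX code" step concrete, here is a minimal Python sketch. It is not the code from the paper: the function names and the two sample pixels are made up for illustration, and it assumes the pixels are already available as (R, G, B) tuples.

```python
def average_rgb(pixels):
    """Average a list of (R, G, B) tuples channel by channel."""
    if not pixels:
        raise ValueError("need at least one pixel")
    count = len(pixels)
    # Integer-round each channel average so it stays a valid 0-255 value.
    return tuple((sum(p[i] for p in pixels) + count // 2) // count for i in range(3))


def to_hex(rgb):
    """Format an (R, G, B) tuple as a HEX color code such as #008100."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


# Hypothetical pixels sampled from one shade of the chart.
shade_pixels = [(0, 128, 0), (0, 130, 0)]
shade_hex = to_hex(average_rgb(shade_pixels))
print(shade_hex)
```

The same two helpers could be applied once per LCC shade to build a small lookup table of HEX codes.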

Generate PySpark Schema dynamically in Python from JSON Sample

Hi Folks, If you need to generate a PySpark schema from JSON, you can always use my tool here: https://preetranjan.github.io/pyspark-schema-generator/ but if you need to do it in Python, here is a code snippet for it. It takes a Python dictionary as input and generates the PySpark schema.

    import json
    from pyspark.sql.types import *

    def GeneratePySparkSchema(json):
        fields = []
        for key, value in json.items():
            if isinstance(value, dict):
                field = StructField(key, GeneratePySparkSchema(value), True)
            elif isinstance(value, list):
                if len(value) == 0:
                    field = StructField(key, ArrayType(StringType()), True)
                elif isinstance(value[0], dict):
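The snippet is cut off in this excerpt, so the following is a separate, minimal sketch of the same idea, not a completion of it. It walks a sample dictionary and builds a plain dict in the JSON layout that Spark uses to serialize schemas, so it runs without PySpark. The name `spark_type` is hypothetical, and the choices of `long` for integers, `double` for floats, and `string` for anything else (including empty lists, as in the snippet above) are assumptions.

```python
import json


def spark_type(value):
    """Map a sample JSON value to a type in Spark's JSON schema layout."""
    if isinstance(value, dict):
        return {
            "type": "struct",
            "fields": [
                {"name": k, "type": spark_type(v), "nullable": True, "metadata": {}}
                for k, v in value.items()
            ],
        }
    if isinstance(value, list):
        # Infer the element type from the first item; empty lists fall back to string.
        element = spark_type(value[0]) if value else "string"
        return {"type": "array", "elementType": element, "containsNull": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"  # strings, nulls, and anything unrecognized


sample = json.loads('{"id": 102, "address": {"city": "Bhubaneswar"}, "mobiles": []}')
schema_dict = spark_type(sample)
print(json.dumps(schema_dict, indent=2))
```

If PySpark is installed, passing `schema_dict` to `StructType.fromJson` should turn it into a schema object.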

firstworkdate Qlik Equivalent in Spark SQL

Hi There! I was chatting with a friend who was facing a problem on a migration project. The old scripts and processes were built with Qlik, which I had never heard of until now. The script there was using a function called firstworkdate. The firstworkdate function returns the latest starting date to achieve no_of_workdays (Monday-Friday) ending no later than end_date, taking into account any optionally listed holidays. end_date and holiday should be valid dates or timestamps. Here I have excluded the holiday part, though. Please suggest if you have anything in mind to implement it. I still think it's a very lame solution, but it works. 😀 Here is a proposed solution:

select col2 as endDate, reverse(slice(reverse(filter(transform(sequence(date_sub(col2, col1 * 2), col2), x -> struct(x, weekday(x))), x -> x.col2 not in (5,
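As a reference for checking any Spark SQL port, the behaviour described above can be sketched in plain Python. This is only an illustration of the definition, not the Qlik implementation: `first_work_date` is a hypothetical name, and holidays are ignored, as in the post.

```python
from datetime import date, timedelta


def first_work_date(end_date, no_of_workdays):
    """Latest start date giving no_of_workdays Mon-Fri days ending by end_date."""
    if no_of_workdays < 1:
        raise ValueError("no_of_workdays must be at least 1")
    current = end_date
    remaining = no_of_workdays
    while True:
        # date.weekday() is 0 for Monday through 6 for Sunday.
        if current.weekday() < 5:
            remaining -= 1
            if remaining == 0:
                return current
        current -= timedelta(days=1)


# 2014-12-29 is a Monday; nine workdays ending on it start on Wednesday 2014-12-17.
print(first_work_date(date(2014, 12, 29), 9))
```

Walking backwards one day at a time is slow for very large counts, but it keeps the logic easy to compare against the SQL version.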

Building a Login Flow with .NET MAUI

Let's build a login flow with .NET MAUI using Shell. Authentication is very common in mobile apps, so let's get started. Obviously, the app should ask for login only if the user isn't authenticated. We will check for authentication; if the user is not authenticated, we will move to the Login page, and if login succeeds, we will move to the Home page. For this example we will override the back-button-pressed event to quit the application, but you can customize it as per your need. For this post I am using simple authentication, but you can use JWT or any method you want. Here is the example of the login flow: All the pages that are used need to be registered with the Shell. If you are a bit familiar with Shell navigation, the first content page is the one that is displayed after startup, so we need to order the pages in the Shell accordingly. The pages we are using here for the example: LoadingPage, LoginPage, HomePage, SettingsPage. Here is the App

PySpark Schema Generator - A simple tool to generate PySpark schema from JSON data

Hi Folks, I built a small tool that solves a problem data engineers face while dealing with JSON data. As we know, JSON data is semi-structured, and we usually ingest it and denormalize it into smaller tables for further processing. In my case I had to generate a PySpark schema from JSON to ingest the data, and the JSON structure often changes. The JSON I was dealing with was very complex, but let me give you an example of the tool and the problem it solves. For example, we have JSON coming from Kafka like below:

{
  "name": "PREETish ranjan",
  "dob": "2022-03-04T18:30:00.000Z",
  "status": "active",
  "isActive": true,
  "id": 102,
  "address": {
    "city": "Bhubaneswar",
    "PIN": 500016
  },
  "mobiles": ["8989898989", "5656565656"],
  "id_cards": [1, 2, 3, 4, 5]
}

The output I need is like this:

StructType([
    Str...
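The expected output is truncated in this excerpt, so the sketch below is only a guess at the general shape, not the tool's actual logic or output. It renders schema source text from the sample JSON above using plain Python. The function name `render_schema`, the use of `LongType()` for integers, and the fallback to `StringType()` for empty lists and unknown values are all assumptions.

```python
import json


def render_schema(value, indent=0):
    """Render PySpark schema source text for a sample JSON value."""
    pad = " " * indent
    if isinstance(value, dict):
        inner = ",\n".join(
            f'{pad}    StructField("{k}", {render_schema(v, indent + 4)}, True)'
            for k, v in value.items()
        )
        return f"StructType([\n{inner}\n{pad}])"
    if isinstance(value, list):
        # Infer the element type from the first item only.
        element = render_schema(value[0], indent) if value else "StringType()"
        return f"ArrayType({element})"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "BooleanType()"
    if isinstance(value, int):
        return "LongType()"
    if isinstance(value, float):
        return "DoubleType()"
    return "StringType()"


sample = json.loads("""
{
  "name": "PREETish ranjan",
  "isActive": true,
  "id": 102,
  "address": {"city": "Bhubaneswar", "PIN": 500016},
  "mobiles": ["8989898989", "5656565656"],
  "id_cards": [1, 2, 3, 4, 5]
}
""")
schema_text = render_schema(sample)
print(schema_text)
```

Generating text rather than schema objects means the result can be pasted straight into a job, at the cost of having to rerun the generator whenever the JSON structure changes.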