Spring Boot with Angular

Hi Readers, in this post you will find the source code for my Spring Boot with Angular app at https://github.com/JosePraveen/spring-boot-with-angular. Project structure: Spring Boot app, Angular app. Start the Angular app using ‘npm start’. UI screens: bike register screen, admin screen, buyer details screen. You can also bookmark this page for future reference. You can share this page with your …

All about Apache NiFi

Hi Readers, in this post you will learn what Apache NiFi is and when to use it. Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. It automates the flow of data between systems, e.g. JSON -> database, Kafka -> Elasticsearch, FTP -> Hadoop, etc. It has a drag-and-drop interface. It …

Optimizing joins techniques in Apache Spark

Hi Readers, in this post you will learn the various join optimization techniques that can be used in Apache Spark. The three types of joins in Spark are: Shuffle hash join (default): a map-reduce style join. It shuffles the datasets based on the output key. During the reduce phase, it joins the datasets for the same output …
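The shuffle-then-join idea in this excerpt can be sketched in plain Python. This is only an illustration of the general technique, not Spark's implementation or API: the function names, the sample data, and the choice of an inner join with 4 partitions are all assumptions made for the example.

```python
from collections import defaultdict


def shuffle(records, num_partitions):
    """Map phase: route each (key, value) record to a partition chosen by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def shuffle_hash_join(left, right, num_partitions=4):
    """Inner-join two lists of (key, value) pairs, one partition at a time."""
    left_parts = shuffle(left, num_partitions)
    right_parts = shuffle(right, num_partitions)
    joined = []
    for left_part, right_part in zip(left_parts, right_parts):
        # Reduce phase: records with equal keys always land in the same
        # partition, so each partition can be joined independently.
        table = defaultdict(list)
        for key, value in left_part:
            table[key].append(value)
        for key, value in right_part:
            for left_value in table.get(key, []):
                joined.append((key, left_value, value))
    return joined


# Hypothetical sample data, keyed by an integer id.
bikes = [(1, "roadster"), (2, "cruiser"), (3, "scooter")]
owners = [(1, "asha"), (3, "ravi"), (3, "meena"), (4, "john")]

result = sorted(shuffle_hash_join(bikes, owners))
print(result)
```

Keys 2 and 4 appear on only one side, so they drop out of the inner join, while key 3 produces one output row per matching owner.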

Parallelism techniques in Apache Spark

Hi Readers, in this post you will learn the various parallelism techniques that can be used in Apache Spark. We need parallelism techniques to make full use of the cluster's capacity. In HDFS, it means that the number of partitions is the same as the number of input splits, which is mostly the same …

File Optimization and Compression techniques in Apache Hive

Hi Readers, in this post you will learn the various file optimization and compression techniques that can be used in Apache Hive. Hive supports the TEXTFILE, SEQUENCEFILE, RCFILE, ORC, and PARQUET file formats. Optimization techniques: TEXTFILE: This is the default file format for Hive. Data is not compressed in the text file. It can be compressed …

Optimization techniques in Apache Spark

Hi Readers, in this post you will learn the various optimization techniques used in Apache Spark. We can optimize our Spark applications with techniques such as data serialization and broadcasting. Data serialization: Spark provides two options for data serialization: (1) Java serialization and (2) Kryo serialization. Compared to Java serialization, Kryo serialization is much faster than …

Run a Scala program in Apache Spark

Hi Readers, in this post you will learn how to run a Scala program in Apache Spark. On your machine you need to install Scala, sbt (build tool), Java 8, and a Spark cluster (AWS, or use the Cloudera VM). Open a command prompt and type ‘sbt new scala/hello-world.g8’. Write your logic in the src/main/scala/Main.scala file. To compile / run / package the Scala project, open the …
